INTEGRAL, 
MEASURE AND DERIVATIVE: 
A UNIFIED APPROACH 


G. E. SHILOV 
B. L. GUREVICH 


Revised English Edition 
Translated and Edited by 


Richard A. Silverman 


DOVER PUBLICATIONS, INC. 
NEW YORK 


Copyright © 1977 by Dover Publications, Inc. 

Copyright © 1966 by Richard A. Silverman. 

All rights reserved under Pan American and 
International Copyright Conventions. 


Published in Canada by General Publishing Com- 
pany, Ltd., 30 Lesmill Road, Don Mills, Toronto, 
Ontario. 

Published in the United Kingdom by Constable 
and Company, Ltd., 10 Orange Street, London WC2H 
7EG. 


This Dover edition, first published in 1977, is an 
unabridged and corrected republication of the Eng- 
lish translation originally published by Prentice- 
Hall, Inc., in 1966. 


International Standard Book Number: 0-486-63519-8 
Library of Congress Catalog Card Number: 77-75774 


Manufactured in the United States of America 
Dover Publications, Inc. 
180 Varick Street 
New York, N.Y. 10014 


AUTHORS’ PREFACE 


This volume is intended as a textbook for students of
mathematics and physics, at the graduate or advanced
undergraduate level. It should also be intelligible to
readers with a good background in advanced calculus
and sufficient “mathematical maturity.”


The phrase “unified approach” in the title of the book
refers to the consistent use of the Daniell scheme, which
starts from the concept of an elementary integral defined
(axiomatically) on a family of elementary functions. In
the Introduction we explain in detail why we prefer
this approach to others, in particular to the
Lebesgue-Radon-Fréchet approach, which starts from axiomatic
measure theory.


In preparing the American edition, we gave the book a 
complete overhaul. In particular, Chapter 1 was enlarged, 
Part 2 on the Stieltjes integral was totally rewritten, a 
section on Lebesgue-Stieltjes integration in infinite- 
dimensional spaces was added, and the order of presen- 
tation was changed in many places. We take this 
opportunity to thank Dr. R. A. Silverman, who while 
translating the book worked through all the mathematics 
and suggested many important improvements, resulting 
in a simpler treatment in some cases and a deeper one 
in others. 


The inspiration for much of the material presented here
stems from three books, all listed in the Bibliography
on p. 227: Riesz and Nagy’s Functional Analysis, which
treats the Daniell scheme for the case of one or several
real variables, Loomis’ Introduction to Harmonic Analysis,
where the Daniell scheme is presented in a general form
(somewhat different from ours), and Saks’ Theory of
the Integral, which gives a general method for
differentiating set functions in n-space with respect to a system
of cubes (the simplest example of a Vitali system). Some
use has been made of text and problems borrowed from
Chapters 4 and 6 of the book Mathematical Analysis,
A Special Course (Moscow, 1961), written by the senior
author. We would also like to express our gratitude
to Prof. D. A. Raikov, who read the entire book in
manuscript and made a number of important suggestions.


G.E.S. 
B.L.G. 


TRANSLATOR’S PREFACE 


The present book differs from most others in the same 
area by approaching its subject from the standpoint of 
the Daniell integral. Concerning the merits of this 
approach, I can do no better than quote from Paley and 
Wiener (Fourier Transforms in the Complex Domain, 
New York, 1934, p. 145): “In an ideal course on Lebesgue 
integration, all theorems would be developed from the 
point of view of the Daniell integral.” As far as I know, 
there is no place else in the textbook literature where the 
Daniell scheme has been pursued with full generality, 
even to the point of including a complete theory of 
differentiation. 


During the course of the translation, I had the benefit 
of the authors’ unstinting cooperation, which I take 
this occasion to gratefully acknowledge. They in turn 
had the opportunity of examining the translation in 
manuscript and conferring upon it their “seal of ap- 
proval.” The Bibliography was prepared expressly for 
this edition, and is confined to books available in English. 
Sections marked with asterisks relate to certain side 
issues and can safely be omitted without loss of
continuity.

R.A.S.




CONTENTS


INTRODUCTION, Page 1.


PART 1 THE INTEGRAL, Page 5.


1 THE RIEMANN INTEGRAL AND STEP FUNCTIONS, Page 7.

1.1. The Riemann Integral, 7.
1.2. Lower and Upper Integrals, 8.
1.3. Step Functions, 11.
1.4. Sets of Measure Zero and Sets of Full Measure, 13.
1.5. Further Properties of Step Functions, 15.
1.6. Application to the Theory of the Riemann Integral, 17.
1.7. Invariant Definition of Lower and Upper Functions. Lebesgue’s
     Criterion for Riemann Integrability, 18.
1.8. Generalization of the Riemann Integral: The Key Idea, 20.
Problems, 21.


2 GENERAL THEORY OF THE INTEGRAL, Page 23.

2.1. Elementary Functions and the Elementary Integral, 23.
2.2. Sets of Measure Zero and Sets of Full Measure, 24.
2.3. The Class $L^+$. Integration in $L^+$, 26.
2.4. Properties of the Integral in the Class $L^+$, 28.
2.5. The Class $L$. Integration in $L$, 29.
2.6. Levi’s Theorem, 32.
2.7. Lebesgue’s Theorem, 34.
2.8. Summability of Almost-Everywhere Limits, 36.
  2.8.1. Measurable functions, 36.
  2.8.2. Fatou’s lemma, 37.
2.9. Completeness of the Space $L$. The Riesz-Fischer Theorem, 38.
2.10. Fubini’s Theorem, 40.
2.11. Integrals of Variable Sign, 44.
  2.11.1. Riesz’s representation theorem, 44.
  2.11.2. Construction of a space of summable functions for the
          functional I, 47.
  2.11.3. Other representations of I. The canonical representation, 48.


3 THE LEBESGUE INTEGRAL IN n-SPACE, Page 50.

3.1. Relation between the Riemann Integral and the Lebesgue
     Integral, 50.
3.2. Improper Riemann Integrals and the Lebesgue Integral, 51.
3.3. Fubini’s Theorem for Functions of Several Real Variables, 52.
3.4. Continuous Functions as Elementary Functions, with the Riemann
     Integral as Elementary Integral, 54.
Problems, 56.


PART 2 THE STIELTJES INTEGRAL, Page 59.


4 THE RIEMANN-STIELTJES INTEGRAL, Page 61.

4.1. Blocks and Sheets, 61.
4.2. Quasi-Volumes, 63.
4.3. Quasi-Length and the Generating Function, 64.
4.4. The Riemann-Stieltjes Integral and Its Properties, 66.
  4.4.1. Construction of the Riemann-Stieltjes integral, 66.
  4.4.2. Further properties, 68.
  4.4.3. The case of infinite B, 69.
  4.4.4. Equivalent quasi-volumes: a preview, 71.
4.5. Essential Convergence. The Helly Theorems, 72.
*4.6. Applications to Analysis, 75.
  4.6.1. Herglotz’s theorem, 75.
  4.6.2. Bernstein’s theorem, 77.
  4.6.3. The Bochner-Khinchin theorem, 80.
4.7. Structure of Signed Quasi-Volumes, 81.
  4.7.1. Representation of a signed quasi-volume $\sigma$ as the
         difference between two nonnegative quasi-volumes, 81.
  4.7.2. Other representations of $\sigma$. The canonical
         representation, 82.
  4.7.3. Formulas for the positive, negative and total variations, 83.
  4.7.4. The case n = 1. Jordan’s theorem, 84.
Problems, 86.


5 THE LEBESGUE-STIELTJES INTEGRAL, Page 88.

5.1. Definition of the Lebesgue-Stieltjes Integral, 88.
5.2. Examples, 89.
5.3. The Lebesgue-Stieltjes Integral with Respect to a Signed
     Quasi-Volume, 93.
5.4. The General Continuous Linear Functional on the Space C(B), 94.
5.5. Relation between the Quasi-Volumes $\sigma$ and $\tilde{\sigma}$, 96.
5.6. Continuous Quasi-Volumes, 99.
5.7. Equivalent Quasi-Volumes, 103.
5.8. Construction of the Lebesgue-Stieltjes Integral with Step
     Functions as Elementary Functions, 105.
Problems, 107.


PART 3 MEASURE, Page 111.


6 MEASURABLE SETS AND GENERAL MEASURE THEORY, Page 113.

6.1. More on Measurable Functions, 113.
6.2. Measurable Sets, 116.
6.3. Countable Additivity of Measure, 117.
6.4. Stone’s Axioms, 119.
6.5. Characterization of Measurable Functions in Terms of
     Measure, 120.
6.6. The Lebesgue Integral as Defined by Lebesgue, 121.
6.7. Integration over a Measurable Subset, 123.
6.8. Measure on a Product Space, 125.
*6.9. The Space $L_p$, 126.
Problems, 131.


7 CONSTRUCTIVE MEASURE THEORY, Page 134.

7.1. Semirings of Subsets, 134.
7.2. The Subspace Generated by a Semiring of Summable Sets, 136.
7.3. Sufficient Semirings, 136.
7.4. Completely Sufficient Semirings, 139.
7.5. Outer Measure and the Measurability Criterion, 141.
7.6. Measure Theory in n-Space. Examples, 143.
7.7. Lebesgue Measure for n = 1. Inner Measure, 147.
Problems, 148.


8 AXIOMATIC MEASURE THEORY, Page 150.

8.1. Elementary, Borel and Lebesgue Measures, 150.
8.2. Lebesgue and Borel Extensions of an Elementary Measure, 153.
8.3. Construction of the Integral from a Lebesgue Measure, 158.
8.4. Signed Borel Measures, 159.
8.5. Quasi-Volumes and Measure Theory, 162.
8.6. The Hahn Decomposition, 163.
*8.7. The General Continuous Linear Functional on the Space C(X), 166.
*8.8. The Lebesgue-Stieltjes Integral on an Infinite-Dimensional
      Space, 167.
  8.8.1. Cylinder sets, blocks and quasi-volumes. Extensions and
         projections, 168.
  8.8.2. Construction of the space $L_\omega(X)$. Kolmogorov’s
         theorem, 171.
  8.8.3. Structure of $\omega$-measurable sets and functions, 173.
Problems, 178.


PART 4 THE DERIVATIVE, Page 181.


9 MEASURE AND SET FUNCTIONS, Page 183.

9.1. Classification of Set Functions. Decomposition into Continuous
     and Discrete Components, 183.
9.2. Decomposition of a Continuous Set Function into Absolutely
     Continuous and Singular Components. The Radon-Nikodym
     Theorem, 187.
*9.3. Some Consequences of the Radon-Nikodym Theorem, 190.
  9.3.1. The general continuous linear functional on the space
         $L(X)$, 190.
  9.3.2. The general continuous linear functional on the space
         $L_p(X)$, 192.
9.4. Positive, Negative and Total Variations of the Sum of Two Set
     Functions, 194.
9.5. The Case X = [a, b]. Absolutely Continuous Point Functions, 196.
9.6. The Lebesgue Decomposition, 199.
Problems, 203.


10 THE DERIVATIVE OF A SET FUNCTION, Page 205.

10.1. Preliminaries. Various Definitions of the Derivative, 205.
10.2. Differentiation with Respect to a Net, 208.
10.3. Differentiation with Respect to a Vitali System. The
      Lebesgue-Vitali Theorem, 209.
10.4. Some Consequences of the Lebesgue-Vitali Theorem, 215.
  10.4.1. De Possel’s theorem, 215.
  10.4.2. Lebesgue’s theorem on differentiation of a function of
          bounded variation, 216.
10.5. Differentiation with Respect to the Underlying
      $\sigma$-Ring, 220.
Problems, 223.


BIBLIOGRAPHY, Page 227.


INDEX, Page 229.


INTRODUCTION 


One of the basic concepts of analysis is that of the integral. The classical 
theory of integration, perfected in the middle of the last century by Cauchy 
and Riemann, is entirely adequate for solving many mathematical problems, 
both pure and applied. However, it does not meet the needs of a number 
of important branches of mathematics and physics of comparatively recent 
vintage, being deficient in at least three respects: 


1. As classically defined, the integral applies only to functions of one 
or several variables, whereas nowadays one must be able to integrate 
over sets which cannot be described by a finite number of real param- 
eters. This necessity arises, for example, in investigations ranging 
from probability theory and partial differential equations to hydro- 
dynamics and quantum mechanics. 


2. Even in the case of finitely many variables, only “relatively few”
functions (e.g., those that are continuous, piecewise continuous or
satisfy other rather strong requirements) can be integrated by using
Riemann’s classical definition of the integral. Some indication of
the smallness of the class of Riemann-integrable functions is given by
the following fact: It is an easy matter to construct a sequence of
functions $\{f_n(x)\}$ on the interval $a \le x \le b$, say, which
satisfies the Cauchy convergence criterion “in the mean,” in the sense that

$$\lim_{m,n\to\infty} \int_a^b |f_m(x) - f_n(x)|\,dx = 0,$$

without the sequence having a limit function which is Riemann
integrable. This “lack of completeness” of the class of
Riemann-integrable functions is a grave drawback, since completeness is
well-nigh indispensable in any branch of modern analysis.


3. In the classical theory, the domain of integration X (e.g., a line or a
plane) is “homogeneous” in the sense that the values of integrals over
X do not change if the integrands are shifted. However, there are many
problems where X can no longer be regarded as homogeneous. One
can often take account of this lack of homogeneity by introducing a
variable density, as is done in problems involving the vibration of
inhomogeneous strings. But this device entails certain difficulties.
For example, how should one define the density of a string loaded
with point masses?


The above remarks amply illustrate the inadequacy of the classical theory 
of integration. All these difficulties disappear in the modern theory of 
integration, developed by some of the leading mathematicians of our time, 
from Lebesgue to the present day. The new theory does not require the domain
X to be either finite-dimensional or homogeneous, and leads to a “sufficiently
large” class of integrable functions, in particular, a class which is complete
relative to convergence in the mean.

In presenting the general theory of the integral, we have chosen the
Daniell method as our basic approach. This method gets to the crux of the
matter more quickly and directly than the original method of Lebesgue,
since it is not based on preliminary construction of a theory of measure.
Moreover, from the Daniell standpoint, measure theory itself is particularly
simple and natural, appearing as an almost self-evident consequence of the
theory of the integral. In this regard, it should be pointed out that the
Lebesgue and Daniell constructions of the integral are equivalent if
finite-valued (“step”) functions are chosen as the elementary functions. However,
there are cases where functions other than step functions should be chosen
as the elementary functions (e.g., in studying linear functionals on the space
of continuous functions defined on a compact metric space), and then the
Daniell method is effective while the Lebesgue method is not.

Having made these preliminary observations, we now give a brief sketch
of the contents of the book. Part 1 is devoted to the integral, and consists
of three chapters. In the first, we define the Riemann integral for a continuous
function of n variables as the limit of a sequence of “lower Darboux sums,”
or, what amounts to the same thing, a nondecreasing sequence of step
functions. This approach has the merit of pointing the way to further
generalization, by axiomatization of certain special properties of integrals of step
functions. The most basic of these properties is “upper continuity,” i.e.,
if a nonincreasing sequence of step functions converges to zero, then so do
their integrals. This generalization is carried out in Chap. 2, starting from a
family of “elementary functions” defined on an arbitrary set X and equipped
with an “elementary integral” which satisfies the axioms suggested by
corresponding properties of integrals of step functions. The family of elementary
functions is then enlarged by taking monotonic passages to the limit and
forming differences. The result is a space of “summable functions,” which
is complete relative to the “natural” norm based on the new definition of the
integral. Finally, in Chap. 3, we apply the general theory to functions of n
real variables, thereby obtaining the “classical” Lebesgue integral.

In Part 2, we consider the Stieltjes integral, corresponding to the case
where X is inhomogeneous. Chapter 4 is concerned with the
Riemann-Stieltjes integral in n-space, constructed from a “quasi-volume” (i.e., an
additive function of n-dimensional parallelepipeds, called “blocks”). Here
we digress to indicate some applications of Stieltjes integration to classical
analysis, based on the use of the Helly limit theorems. In Chap. 5 the Daniell
scheme (described in Chap. 2) is used to construct the Lebesgue-Stieltjes
integral in n-space, starting from continuous functions as elementary
functions and the Riemann-Stieltjes integral as elementary integral. One can also
start from step functions as elementary functions, as in the construction of
the ordinary Lebesgue integral, but then an extra requirement of “upper
continuity” must be imposed on the quasi-volume. However, this causes
no trouble, since every quasi-volume $\sigma$ of bounded variation is equivalent
to an upper continuous quasi-volume $\tilde{\sigma}$, in the sense that Riemann-Stieltjes
integrals of continuous functions have the same values with respect to both
$\sigma$ and $\tilde{\sigma}$.

In Part 3 the general Daniell scheme is used to develop a theory of measure.
We start in Chap. 6 with a family of elementary functions defined on an
arbitrary set X, equipped with an integral satisfying the conditions stipulated
in Chap. 2. A function on X is said to be “measurable” if it is the limit of a
sequence of elementary functions in the sense of convergence “almost
everywhere.” In particular, every summable function is measurable. A subset
$E \subset X$ is said to be measurable if its characteristic function $\chi_E(x)$ is
measurable, and summable if $\chi_E(x)$ is summable. In the latter case, the “measure”
of E is defined as the integral of $\chi_E(x)$. It follows at once from earlier
considerations that measure is “countably additive.” Then we give an alternative
definition of the integral of a summable function f(x), based on Lebesgue’s
original approach, in terms of the measures of the sets on which f(x) takes
values lying in given intervals. Chapters 7 and 8 are devoted to a deeper
study of measure theory. The first of these chapters explores constructive
measure theory, where general measurable sets are approximated by
countable unions and intersections of particularly simple measurable sets (blocks
in the case where X is n-space); the second deals with axiomatic measure
theory, where a theory of the integral is constructed from a postulated
“elementary measure” which is susceptible to various “extensions.” Here
again, consistent use of the Daniell scheme leads to great simplifications,
and the two approaches, axiomatization of the integral (Chap. 2) and
axiomatization of measure (Chap. 8), finally blend into a single theory. We
conclude Part 3 with an introduction to the theory of Lebesgue-Stieltjes
integration in infinite-dimensional spaces, a topic of great current interest.

The last part of the book (Part 4) is devoted to the theory of the derivative.
In Chap. 9 we consider two countably additive set functions defined on the
same abstract set X, one of which is still called a measure since it is
nonnegative. For the other set function, which is in general “signed” (i.e., which
takes values of either sign), we establish a canonical decomposition (relative
to the measure) into a discrete component and a continuous component,
afterwards decomposing the continuous component in turn into a singular
component and an absolutely continuous component A(E). It turns out that
A(E) is the integral over E of a summable function g(x), called the “density”
of A(E) (this is the celebrated Radon-Nikodym theorem). Particularizing
the theory to the case of functions of one variable, we obtain the classical
Lebesgue decomposition of an arbitrary (point) function of bounded variation
into the sum of three terms, i.e., a discrete component, a singular component
and an absolutely continuous component. The problem of finding the density
g(x) is examined in Chap. 10. This leads to the operation of differentiation,
which we first study for the case where X is an interval $a \le x \le b$. We
consider three different ways of defining the derivative, one based on special
intervals (with binary rational end points), another on arbitrary intervals,
and a third on arbitrary Borel sets. Each of these three definitions can be
generalized to the case where X is an arbitrary set, equipped with a Borel
measure. The first corresponds to differentiation with respect to a “net,”
the second to differentiation with respect to a “Vitali system,” and the third
to differentiation with respect to the class of all Borel subsets of X. In each
case we prove that the derivative exists and equals the density g(x) almost
everywhere. Finally, as special cases, we prove de Possel’s theorem on
differentiation with respect to a net and Lebesgue’s theorem on differentiation
of a function of bounded variation.


PART 1


THE INTEGRAL


1 THE RIEMANN INTEGRAL
AND STEP FUNCTIONS


1.1. The Riemann Integral


By an n-dimensional rectangular parallelepiped we mean a set of points
$x = (x_1, \ldots, x_n)$ of the form

$$B = \{x : a_1 \le x_1 \le b_1, \ldots, a_n \le x_n \le b_n\},$$

where, naturally, it is assumed that

$$a_1 \le b_1, \ldots, a_n \le b_n.$$

For brevity, such parallelepipeds will henceforth be called “blocks.” The
largest of the numbers $b_1 - a_1, \ldots, b_n - a_n$ will be called the size of the
block B, and the quantity

$$s(B) = (b_1 - a_1) \cdots (b_n - a_n)$$

will be called the volume of the block. The function s(B) is an additive function
of its argument, in the sense that if the block B is divided into subblocks
$B_1, \ldots, B_p$ with no interior points in common (such subblocks are said to
form a partition of B), then

$$s(B) = s(B_1) + \cdots + s(B_p).$$

A block which is fixed during the course of a given discussion will be called
the “basic block,” denoted by boldface B.
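The definitions of size, volume and additivity can be checked numerically. The following is a minimal sketch, not part of the original text: a block is represented (by assumption) as a list of pairs `(a_i, b_i)`, and the helper names `volume`, `size` and `split` are hypothetical.

```python
from math import prod

def volume(block):
    """Volume s(B) of a block given as a list of (a_i, b_i) pairs."""
    return prod(b - a for a, b in block)

def size(block):
    """Size of a block: the largest of the edge lengths b_i - a_i."""
    return max(b - a for a, b in block)

def split(block, axis, cut):
    """Divide a block into two subblocks by the hyperplane x_axis = cut."""
    a, b = block[axis]
    assert a <= cut <= b
    left, right = list(block), list(block)
    left[axis] = (a, cut)
    right[axis] = (cut, b)
    return left, right

B = [(0.0, 2.0), (0.0, 3.0)]            # the block [0, 2] x [0, 3]
B1, B2 = split(B, axis=0, cut=0.5)
B2a, B2b = split(B2, axis=1, cut=1.0)
parts = [B1, B2a, B2b]                  # a partition of B into three subblocks

print(volume(B), sum(volume(P) for P in parts))
```

The three subblocks share only boundary points, so their volumes add up to the volume of the original block, as additivity requires.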

We now recall how Riemann integrals are constructed. Let f(x) be a
bounded real function defined in a basic block B. Let $\Pi$ be a partition of B
into subblocks $B_1, \ldots, B_p$, and in each block $B_k$ choose an arbitrary point
$\xi_k$ $(k = 1, \ldots, p)$. Then form the Riemann sum

$$R_\Pi(f) = \sum_{k=1}^{p} f(\xi_k)\, s(B_k).$$

Let $d(\Pi)$ denote the largest size of the blocks $B_1, \ldots, B_p$, and let
$\Pi_1, \ldots, \Pi_q, \ldots$ be a sequence of partitions such that $d(\Pi_q) \to 0$.
If the sequence $R_{\Pi_q}(f)$ has a limit as $q \to \infty$, which is independent of the
choice of the sequence $\Pi_q$ [provided only that $d(\Pi_q) \to 0$] or of the points
$\xi_k \in B_k$, then the limit is called the Riemann integral of the function f(x)
[over the block B], and we write

$$\int_B f(x)\,dx = \lim_{d(\Pi)\to 0} R_\Pi(f). \qquad (1)$$
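To see definition (1) at work, one can compute Riemann sums for finer and finer partitions. The sketch below is an illustration under simplifying assumptions that are not in the text: the block is the interval [0, 1], the partitions are uniform, and the function `riemann_sum` and its `choose` parameter are hypothetical names.

```python
import random

def riemann_sum(f, a, b, p, choose):
    """Riemann sum of f over [a, b] split into p equal subintervals.

    choose(lo, hi) returns the sample point xi_k in [lo, hi]."""
    h = (b - a) / p
    total = 0.0
    for k in range(p):
        lo, hi = a + k * h, a + (k + 1) * h
        total += f(choose(lo, hi)) * (hi - lo)
    return total

f = lambda x: x * x
rng = random.Random(0)                      # fixed seed for reproducibility
left = lambda lo, hi: lo                    # always take the left end point
rand = lambda lo, hi: rng.uniform(lo, hi)   # take an arbitrary point

sums_left = [riemann_sum(f, 0.0, 1.0, p, left) for p in (10, 100, 1000)]
sums_rand = [riemann_sum(f, 0.0, 1.0, p, rand) for p in (10, 100, 1000)]
print(sums_left)
print(sums_rand)
```

Both lists approach 1/3, whatever rule is used to pick the points $\xi_k$, which is what the independence requirement in the definition demands.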

One would now like to know the class of functions f(x) for which this
limit exists. In his Cours d’Analyse (1821), Cauchy proved that the integral (1)
exists if f(x) is continuous.¹ By 1837 Dirichlet had already observed that there
are (discontinuous) functions which are not Riemann integrable.² Some time
later, necessary and sufficient conditions for a function f(x) to have a Riemann
integral were found by Riemann, du Bois-Reymond and Lebesgue. In every
case, it turns out that a Riemann-integrable function cannot be “too
discontinuous.” (Lebesgue’s criterion for Riemann integrability will be given
in Sec. 1.7.) Subsequently, various requirements of the theory led to a search
for more general definitions of the integral, applicable to a much wider
class of functions. The most important such definition was given by Lebesgue
in 1902 (for $n = 1$) and later by Radon and Fréchet in the period 1912-1915
(for the general case). The construction of the Lebesgue integral can be
approached in a variety of ways. For the reasons given in the Introduction,
we choose the approach due to Daniell (1918). But first we must say more
about Riemann integrals.

¹ Cauchy’s proof cannot be considered rigorous, since the concept of uniform continuity
was not at his disposal. The first rigorous proof of the existence of the Riemann integral of a
continuous function was given by Darboux in 1875. The definition of uniform continuity
and the theorem concerning the uniform continuity of a function defined on a closed
interval are due to Heine (1870).

² For example, consider the Dirichlet function $\chi(x)$, $0 \le x \le 1$, equal to 0 for irrational
x and to 1 for rational x. Then, given any partition $\Pi$, we can make $R_\Pi(\chi)$ equal to either
0 or 1, by choosing the numbers $\xi_k$ to be either all irrational or all rational.


1.2. Lower and Upper Integrals


Let $\Pi$ be a partition of the block B into subblocks $B_1, \ldots, B_p$, and let

$$m_k = \inf_{x \in B_k} f(x), \qquad M_k = \sup_{x \in B_k} f(x) \qquad (k = 1, \ldots, p).$$

Then the expression

$$\underline{D}_\Pi(f) = \sum_{k=1}^{p} m_k\, s(B_k)$$

is called the lower Darboux sum of f(x), corresponding to the partition $\Pi$.
Similarly,

$$\overline{D}_\Pi(f) = \sum_{k=1}^{p} M_k\, s(B_k)$$

is called the upper Darboux sum of f(x). Obviously, for any choice of the
points $\xi_k \in B_k$ $(k = 1, \ldots, p)$, we have

$$m\,s(B) \le \underline{D}_\Pi(f) \le R_\Pi(f) \le \overline{D}_\Pi(f) \le M\,s(B),$$

where

$$m = \inf_{x \in B} f(x), \qquad M = \sup_{x \in B} f(x).$$

We now compare the values of the lower and upper sums for two different
partitions $\Pi$ and $\Pi'$ of the same basic block B. First suppose $\Pi'$ is obtained
by further subdividing the blocks of the partition $\Pi$ (in which case, $\Pi'$ is
called a refinement of $\Pi$). Then every term $m_k s(B_k)$ of the sum $\underline{D}_\Pi$ is replaced
by a sum of the form

$$\sum_j m_{kj}\, s(B_{kj}),$$

where

$$B_k = \bigcup_j B_{kj}, \qquad m_{kj} = \inf_{x \in B_{kj}} f(x).$$

Since $m_k \le m_{kj}$,

$$m_k\, s(B_k) = m_k \sum_j s(B_{kj}) \le \sum_j m_{kj}\, s(B_{kj}),$$

and hence

$$\underline{D}_\Pi(f) = \sum_k m_k\, s(B_k) \le \sum_k \sum_j m_{kj}\, s(B_{kj}) = \underline{D}_{\Pi'}(f).$$

Thus, in going from the partition $\Pi$ to the “finer” partition $\Pi'$, the lower
Darboux sum can only increase. Similarly, in going from $\Pi$ to $\Pi'$, the upper
Darboux sum can only decrease.
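The effect of refinement can be observed on a small example. The sketch below is only an illustration, with assumptions that are not in the text: the block is an interval, f is nondecreasing (so that the infimum and supremum on each subinterval are attained at its end points), and the names `darboux_sums` and `refine` are hypothetical.

```python
def darboux_sums(f, points):
    """Lower and upper Darboux sums of a nondecreasing f for the partition
    points[0] < points[1] < ... < points[-1].

    Because f is assumed nondecreasing, inf f is taken at the left end
    point and sup f at the right end point of every subinterval."""
    pairs = list(zip(points, points[1:]))
    lower = sum(f(lo) * (hi - lo) for lo, hi in pairs)
    upper = sum(f(hi) * (hi - lo) for lo, hi in pairs)
    return lower, upper

def refine(points):
    """A refinement: add the midpoint of every subinterval."""
    out = []
    for lo, hi in zip(points, points[1:]):
        out += [lo, (lo + hi) / 2]
    return out + [points[-1]]

f = lambda x: x * x                 # nondecreasing on [0, 1]
P = [0.0, 0.25, 0.5, 1.0]
P_fine = refine(P)

L0, U0 = darboux_sums(f, P)
L1, U1 = darboux_sums(f, P_fine)
print(L0, L1, U1, U0)
```

The printed values are nondecreasing from left to right: the lower sum has increased and the upper sum has decreased in passing to the refinement.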

Next let $\Pi$ and $\Pi'$ be arbitrary partitions. Then the set of all intersections
of blocks of $\Pi$ with blocks of $\Pi'$ forms a new partition $\Pi''$, which is a
refinement of both $\Pi$ and $\Pi'$. But then, as just shown,

$$\underline{D}_\Pi(f) \le \underline{D}_{\Pi''}(f) \le \overline{D}_{\Pi''}(f) \le \overline{D}_{\Pi'}(f), \qquad (2)$$

i.e., a lower Darboux sum can never exceed an upper Darboux sum. Suppose
we write

$$\sup_\Pi \underline{D}_\Pi(f) = \underline{\int_B} f(x)\,dx, \qquad \inf_\Pi \overline{D}_\Pi(f) = \overline{\int_B} f(x)\,dx,$$

where the supremum and infimum are taken with respect to all partitions
of the block B. The first of these integrals is called the lower (Riemann)
integral of f(x), and the second is called the upper integral of f(x), both over
the block B. Then it follows from (2) that

$$\underline{\int_B} f(x)\,dx \le \overline{\int_B} f(x)\,dx.$$
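THEOREM 1. If $\Pi_1, \ldots, \Pi_q, \ldots$ is a sequence of partitions of the
block B such that $d(\Pi_q) \to 0$, then

$$\lim_{q\to\infty} \underline{D}_{\Pi_q}(f) = \underline{\int_B} f(x)\,dx, \qquad (3)$$

and similarly,

$$\lim_{q\to\infty} \overline{D}_{\Pi_q}(f) = \overline{\int_B} f(x)\,dx. \qquad (4)$$

Proof. Given any $\varepsilon > 0$, there is a partition $\Pi$ such that

$$0 \le \underline{\int_B} f(x)\,dx - \underline{D}_\Pi(f) < \frac{\varepsilon}{2}.$$

Consider the quantity $\underline{D}_{\Pi_q}(f)$. The blocks of the partition $\Pi_q$ fall into
two groups: The first group, denoted by $B_i^{(1)}$, consists of blocks which are
entirely contained in blocks of $\Pi$, and the second group, denoted by
$B_j^{(2)}$, consists of blocks which intersect the boundaries of blocks $B_k$ of $\Pi$.
Correspondingly, we represent $\underline{D}_{\Pi_q}$ in the form

$$\underline{D}_{\Pi_q}(f) = \sum_i m_i^{(1)} s(B_i^{(1)}) + \sum_j m_j^{(2)} s(B_j^{(2)}).$$

Let $B_{jk}^{(2)}$ denote the intersections of the blocks $B_j^{(2)}$ with the blocks $B_k$.
Adding the blocks $B_{jk}^{(2)}$ to the blocks $B_i^{(1)}$, we obtain a new partition $\Pi'$
of the basic block B, which is a refinement of the partition $\Pi$.
Consequently,

$$\underline{D}_{\Pi_q}(f) = \underline{D}_{\Pi'}(f) - \sum_{j,k} m_{jk}^{(2)} s(B_{jk}^{(2)}) + \sum_j m_j^{(2)} s(B_j^{(2)}), \qquad (5)$$

and hence

$$\underline{\int_B} f(x)\,dx - \frac{\varepsilon}{2} < \underline{D}_\Pi(f) \le \underline{D}_{\Pi'}(f) \le \underline{\int_B} f(x)\,dx. \qquad (6)$$

Now let $G_\Pi$ denote the total area of the boundaries of all the blocks
of the partition $\Pi$. Since the blocks $B_{jk}^{(2)}$ and $B_j^{(2)}$ intersect boundaries
of the partition $\Pi$ and have sizes no larger than $d(\Pi_q)$, each of the sums
in the right-hand side of (5) is no larger than $M G_\Pi\, d(\Pi_q)$ in absolute
value. Choosing q such that

$$M G_\Pi\, d(\Pi_q) < \frac{\varepsilon}{4},$$

we clearly have

$$0 \le \underline{\int_B} f(x)\,dx - \underline{D}_{\Pi_q}(f) < \underline{\int_B} f(x)\,dx - \underline{D}_{\Pi'}(f) + \frac{\varepsilon}{2} < \varepsilon,$$

where we have used (5) and (6). This proves (3), and (4) is proved
similarly.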

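If the Riemann integral of the function f(x) exists, then the upper and
lower sums must have the same limit, and hence

$$\lim_{q\to\infty} \underline{D}_{\Pi_q}(f) = \underline{\int_B} f(x)\,dx = \int_B f(x)\,dx = \overline{\int_B} f(x)\,dx = \lim_{q\to\infty} \overline{D}_{\Pi_q}(f)$$

for any sequence of partitions $\Pi_q$ such that $d(\Pi_q) \to 0$. Conversely, if there
is at least one pair of sequences of partitions $\Pi_q$, $\Pi'_q$ $(q = 1, 2, \ldots)$ with
$d(\Pi_q) \to 0$, $d(\Pi'_q) \to 0$, such that

$$\lim_{q\to\infty} \underline{D}_{\Pi_q}(f) = \lim_{q\to\infty} \overline{D}_{\Pi'_q}(f),$$

then, given any sequence $\Pi''_q$ with $d(\Pi''_q) \to 0$,

$$\lim_{q\to\infty} \underline{D}_{\Pi''_q}(f) = \lim_{q\to\infty} \underline{D}_{\Pi_q}(f) = \lim_{q\to\infty} \overline{D}_{\Pi'_q}(f) = \lim_{q\to\infty} \overline{D}_{\Pi''_q}(f),$$

and hence f(x) is Riemann integrable.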
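Theorem 1 and the criterion just stated can be illustrated numerically. The following sketch rests on assumptions that are not in the text: the block is the interval [0, 1], the partitions are uniform, and f is nondecreasing, so that the lower and upper Darboux sums are obtained from the end points of each subinterval. The name `darboux_sums` is hypothetical.

```python
def darboux_sums(f, a, b, p):
    """Lower and upper Darboux sums of a nondecreasing f over [a, b]
    for the uniform partition into p subintervals of size (b - a) / p."""
    h = (b - a) / p
    xs = [a + k * h for k in range(p + 1)]
    pairs = list(zip(xs, xs[1:]))
    lower = sum(f(lo) * (hi - lo) for lo, hi in pairs)   # inf at left end
    upper = sum(f(hi) * (hi - lo) for lo, hi in pairs)   # sup at right end
    return lower, upper

f = lambda x: x ** 3
gaps = []
for p in (10, 100, 1000):
    lower, upper = darboux_sums(f, 0.0, 1.0, p)
    gaps.append(upper - lower)
    print(p, lower, upper)
```

As the size of the partition tends to zero, the gap between the upper and lower sums shrinks (here it is approximately 1/p), and both sums close in on the common value 1/4 of the lower and upper integrals.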


1.3. Step Functions 


Let

$$B = B_1 \cup \cdots \cup B_m$$

be a partition of the basic block B into subblocks $B_1, \ldots, B_m$ with no
interior points in common. Then a function h(x) taking constant values in
each of the blocks $B_1, \ldots, B_m$, i.e., such that

$$h(x) = \begin{cases} h_1 & \text{for } x \in B_1, \\ \;\vdots \\ h_m & \text{for } x \in B_m, \end{cases}$$

is called a step function. The function h(x) can be defined in various ways
(or even left undefined) on the boundary planes of the subblocks $B_j$, which
are planes of discontinuity for h(x); the values of h(x) on these planes will
not matter in our subsequent considerations.

The family of all step functions defined on a block B will be denoted by
H, or if necessary, by H(B). The set H is a linear space with the usual
operations of addition and multiplication by real numbers. Thus, if h(x) and
k(x) are step functions, so is the linear combination

$$l(x) = \alpha h(x) + \beta k(x)$$

with real coefficients $\alpha$ and $\beta$. In fact, if h(x) is constant in the subblocks
$B_1, \ldots, B_m$, while k(x) is constant in the subblocks $B'_1, \ldots, B'_p$, then l(x)
is constant in each of the intersections

$$B_1 B'_1, \ldots, B_m B'_p,$$

which together constitute a partition of B.³

The space H is closed under operations other than the forming of linear
combinations. For example, if h(x) is a step function, so is its absolute value
|h(x)|. Moreover, if h(x) and k(x) are step functions, then so are the functions

$$\max\{h(x), k(x)\}, \qquad \min\{h(x), k(x)\}.$$

In particular, the positive part $h^+(x)$ of any step function h(x), defined by

$$h^+(x) = \max\{h(x), 0\},$$

is itself a step function, and so is the negative part $h^-(x)$, defined by

$$h^-(x) = \max\{0, -h(x)\}.$$


Next we introduce the concept of the integral of a step function h(x).⁴
By the integral of h(x) over the block B, we mean the quantity

$$Ih = \sum_{j=1}^{m} h_j\, s(B_j).$$

The integral of a step function has the following two properties:

a) If h, k are any two step functions and $\alpha$, $\beta$ are any two real numbers,
then

$$I(\alpha h + \beta k) = \alpha Ih + \beta Ik.$$

b) If h and k are two step functions such that $h(x) \le k(x)$ for all $x \in B$,
then $Ih \le Ik$. In particular, if $h(x) \ge 0$, then $Ih \ge 0$.

To prove Property a, suppose h(x) is constant in the blocks $B_1, \ldots, B_m$,
while k(x) is constant in the blocks $B'_1, \ldots, B'_p$. Then both functions are
constant in the blocks $B_1 B'_1, \ldots, B_m B'_p$, and moreover

$$s(B_j) = \sum_i s(B_j B'_i), \qquad s(B'_i) = \sum_j s(B_j B'_i).$$

It follows that

$$Ih = \sum_j h_j\, s(B_j) = \sum_j \sum_i h_j\, s(B_j B'_i),$$

$$Ik = \sum_i k_i\, s(B'_i) = \sum_i \sum_j k_i\, s(B_j B'_i),$$

and hence

$$I(\alpha h + \beta k) = \sum_j \sum_i (\alpha h_j + \beta k_i)\, s(B_j B'_i) = \alpha Ih + \beta Ik,$$

as asserted. Property b is proved similarly.

³ Naturally, any of these intersections which is empty or degenerate (with no interior
points) can be omitted.

⁴ This concept is distinct from the previously introduced notion of a Riemann integral,
and hence will be denoted by a different symbol.
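The proof of Property a translates directly into a computation on the common refinement. The sketch below is a one-dimensional illustration only; the class `Step`, the function `combine` and the sample data are hypothetical and not part of the text. A step function is stored as a list of break points together with the constant value on each subinterval.

```python
from bisect import bisect_right

class Step:
    """Step function on [pts[0], pts[-1]] equal to vals[j] on (pts[j], pts[j+1])."""

    def __init__(self, pts, vals):
        assert len(pts) == len(vals) + 1
        self.pts, self.vals = list(pts), list(vals)

    def __call__(self, x):
        # Locate the subinterval containing x; values at break points are
        # immaterial, as noted in the text.
        j = bisect_right(self.pts, x) - 1
        return self.vals[min(max(j, 0), len(self.vals) - 1)]

    def integral(self):
        """Ih = sum over j of h_j * s(B_j)."""
        return sum(v * (hi - lo)
                   for v, lo, hi in zip(self.vals, self.pts, self.pts[1:]))

def combine(alpha, h, beta, k):
    """alpha*h + beta*k, built on the partition obtained by intersecting
    the subintervals of h with those of k."""
    pts = sorted(set(h.pts) | set(k.pts))
    mids = [(lo + hi) / 2 for lo, hi in zip(pts, pts[1:])]
    return Step(pts, [alpha * h(m) + beta * k(m) for m in mids])

h = Step([0.0, 0.5, 1.0], [2.0, -1.0])
k = Step([0.0, 0.25, 1.0], [4.0, 1.0])
lin = combine(3.0, h, -2.0, k)                        # 3h - 2k
h_plus = Step(h.pts, [max(v, 0.0) for v in h.vals])   # positive part of h

print(lin.integral(), 3.0 * h.integral() - 2.0 * k.integral())
print(h.integral(), h_plus.integral())
```

The first line of output shows the two sides of Property a agreeing; the second illustrates Property b for the pair $h \le h^+$.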


1.4. Sets of Measure Zero and Sets of Full Measure 


In what follows, an important role will be played by coverings of sets by
collections of blocks. We say that a set E (in the basic block B) is covered
by a collection of blocks $\{B_\alpha\}$ if every point of E is an interior point of at
least one block $B_\alpha$. If E is closed, we have the finite subcovering lemma
(a variant of the familiar Heine-Borel theorem): From every collection of
blocks $\{B_\alpha\}$ covering a closed set E ⊂ B, we can select a finite subcollection
covering E.


DEFINITION. A set Z ⊂ B is called a set of measure zero if, given any
ε > 0, there exists a countable (i.e., a finite or countably infinite)
collection of blocks $B_1, B_2, \dots$ covering Z such that the sum of the volumes of
$B_1, B_2, \dots$ is less than ε. The empty set will also be regarded as a set of
measure zero.


Thus a sheet, i.e., the intersection of B with some hyperplane of dimension
n − 1 parallel to a coordinate hyperplane, is a set of measure zero, since,
for any ε > 0, there is a block $B_\varepsilon \subset B$ containing the given sheet whose
volume is less than ε (we need only choose $B_\varepsilon$ to have sufficiently small
thickness). On the other hand, the whole block B is certainly not a set of
measure zero. In fact, suppose $B_1, B_2, \dots$ is a covering of B. By the finite
subcovering lemma, we can select from $B_1, B_2, \dots$ a finite subcollection
which also covers B. But then the sum of the volumes of even this finite
number of blocks must exceed the volume s(B), and hence cannot be less
than ε when ε < s(B).

It is easy to see that the union of a countable collection of sets of measure
zero is itself a set of measure zero. In fact, if $Z_1, \dots, Z_m, \dots$ are sets of
measure zero, then, given any ε > 0, we cover every set $Z_m$ by a countable
collection of blocks, the sum of whose volumes is less than $\varepsilon/2^m$. As a result,
the whole set $Z = Z_1 \cup Z_2 \cup \cdots$ is covered by a countable collection of
blocks,⁵ the sum of whose volumes is less than ε. Therefore Z is a set of
measure zero, as asserted.
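
The bookkeeping in this argument can be illustrated numerically. The sketch below is an assumption-laden toy case: each set $Z_m$ is a single rational point of [0, 1] (certainly a set of measure zero), and the m-th point is covered by an open interval of length $\varepsilon/2^{m+1}$, so that the total length stays below ε however many points are listed. The function names and the cutoff `max_den` are hypothetical.

```python
from fractions import Fraction

def rationals_in_unit_interval(max_den):
    """Enumerate the distinct rationals p/q in [0, 1] with q <= max_den."""
    seen, out = set(), []
    for q in range(1, max_den + 1):
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
    return out

def cover(points, eps):
    """Cover the m-th point (m = 1, 2, ...) by an open interval of length eps / 2**(m + 1)."""
    intervals = []
    for m, x in enumerate(points, start=1):
        half = eps / 2 ** (m + 2)          # half-length of the m-th interval
        intervals.append((x - half, x + half))
    return intervals

eps = Fraction(1, 100)
points = rationals_in_unit_interval(12)
intervals = cover(points, eps)
total = sum(b - a for a, b in intervals)   # sum of the lengths of the covering intervals
print(len(points), float(total))
```

Every listed point is an interior point of its own interval, and the total length is bounded by ε/2 regardless of the number of points, which is the geometric-series estimate used in the text.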

A set E ⊂ B is said to be a set of full measure if its complement $\mathcal{C}E$
(relative to B) is a set of measure zero. The intersection of a countable collection


⁵ Here we use the fact that the union of countably many countable sets is countable. The
sequence $Z_1, \dots, Z_m, \dots$ may terminate.




of sets of full measure is itself a set of full measure. In fact, if $E_1, E_2, \dots$ are
sets of full measure and if $Z_1 = \mathcal{C}E_1$, $Z_2 = \mathcal{C}E_2$, ... are the corresponding
sets of measure zero, then, as just shown, the set

$$\mathcal{C}\Bigl(\bigcap_m E_m\Bigr) = \bigcup_m \mathcal{C}E_m = \bigcup_m Z_m$$

has measure zero. Hence $\bigcap_m E_m$ is a set of full measure, as asserted.

If a given property holds at every point of a set of full measure in the 
block B, we say that the property holds for almost all points of B (or almost 
everywhere in B). There are functions which are continuous almost everywhere,
i.e., continuous except on a set of measure zero. Similarly, in the class
of functions that are allowed to take infinite values, there are functions which
are finite almost everywhere, i.e., finite except on a set of measure zero. The
set of discontinuity points of a step function has measure zero, consisting 
as it does of a finite number of sheets. By the same token, the set of continuity 
points of a step function is a set of full measure. 

The following theorem can be used to give another definition of a set of 
measure zero, in terms of integrals of step functions: 


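THEOREM 2. A set Z ⊂ B is a set of measure zero if and only if, given
any ε > 0, there exists a nondecreasing sequence of nonnegative step
functions

$$h_1^{(\varepsilon)}(x) \le h_2^{(\varepsilon)}(x) \le \cdots \le h_m^{(\varepsilon)}(x) \le \cdots \tag{7}$$
such that
$$Ih_m^{(\varepsilon)} < \varepsilon \quad \text{for every } m = 1, 2, \dots \tag{8}$$
and
$$\sup_m h_m^{(\varepsilon)}(x) \ge 1 \quad \text{for every } x \in Z. \tag{9}$$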


Proof. If Z is a set of measure zero, then, given any ε > 0, there
exists a collection of blocks $B_1, \dots, B_m, \dots$ with total volume less than
ε which covers the set Z. Let $h_m^{(\varepsilon)}(x)$ denote the step function which equals
1 in the blocks $B_1, \dots, B_m$ and 0 outside these blocks. Then the sequence
of step functions $h_1^{(\varepsilon)}(x), \dots, h_m^{(\varepsilon)}(x), \dots$ obviously satisfies (7) and (8).
Moreover, any point $x_0 \in Z$ belongs to some block $B_m$, and hence
$h_m^{(\varepsilon)}(x_0) = 1$. But this implies (9), as required.

Conversely, suppose the properties (7), (8), (9) hold, and let $B_1, \dots, B_{r_1}$
be the collection of blocks in which the function $h_1^{(\varepsilon)}(x)$ takes values
> 1/2. Then the function $h_2^{(\varepsilon)}(x)$ also takes values > 1/2 in the blocks
$B_1, \dots, B_{r_1}$, and in certain blocks $B_{r_1+1}, \dots, B_{r_2}$ as well. Similarly, the
function $h_3^{(\varepsilon)}(x)$ takes values > 1/2 in the blocks $B_1, \dots, B_{r_2}$, and also in
certain blocks $B_{r_2+1}, \dots, B_{r_3}$. Continuing this argument, we obtain an infinite
collection of blocks $B_1, \dots, B_{r_1}, \dots, B_{r_m}, \dots$ with no interior points
in common. Because of (9), the set Z is contained in the union of all
the blocks $B_j$. We now calculate the sum of the volumes of the blocks
$B_j$. Considering only the blocks $B_1, \dots, B_{r_m}$ in which $h_m^{(\varepsilon)}(x)$ takes values
greater than 1/2, and using (8), we have

$$\frac{1}{2} \sum_{j=1}^{r_m} s(B_j) \le Ih_m^{(\varepsilon)} < \varepsilon.$$

If we take the limit as m → ∞, this gives

$$\sum_{j=1}^{\infty} s(B_j) \le 2\varepsilon.$$

The blocks $B_j$ may not cover the set Z, since points of Z need not be
interior points of the blocks $B_j$. However, if we replace every block $B_j$
by a concentric block $B'_j$ with twice the volume of $B_j$, we get a covering
of Z by blocks $B'_j$ with total volume ≤ 4ε. Since ε is arbitrary, Z is a
set of measure zero, and the proof is complete.


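
COROLLARY. Given a set Z ⊂ B, suppose that for every ε > 0, there
exists a step function $h^{(\varepsilon)}(x) \ge 0$ such that $Ih^{(\varepsilon)} < \varepsilon$ and $h^{(\varepsilon)}(x) \ge 1$
on Z. Then Z is a set of measure zero.

Proof. We need merely write

$$h_1^{(\varepsilon)}(x) = h_2^{(\varepsilon)}(x) = \cdots = h^{(\varepsilon)}(x).$$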


1.5. Further Properties of Step Functions 


We now prove two important lemmas: 


LEMMA 1. If a sequence of nonnegative step functions $h_1(x), \dots, h_p(x), \dots$
is nonincreasing,⁶ and if
$$\lim_{p\to\infty} Ih_p = 0,$$
then
$$\lim_{p\to\infty} h_p(x) = 0$$
almost everywhere.


Proof. The function
$$g(x) = \lim_{p\to\infty} h_p(x),$$
defined everywhere in the block B, is nonnegative, and the set
$$G = \{x: g(x) > 0\}$$
is the union of the sequence of sets

$$G_m = \Bigl\{x: g(x) \ge \frac{1}{m}\Bigr\} \qquad (m = 1, 2, \dots).$$


⁶ I.e., if $h_1(x) \ge h_2(x) \ge \cdots$.




Therefore, to show that G is a set of measure zero, it is sufficient to show
that every $G_m$ is a set of measure zero. But on every $G_m$ we have

$$h_p(x) \ge g(x) \ge \frac{1}{m},$$

and hence
$$m h_p(x) \ge 1 \qquad (p = 1, 2, \dots).$$

The function $m h_p(x)$ is a nonnegative step function and
$$I(m h_p) = m\,Ih_p \to 0$$

as p → ∞. Therefore, given any ε > 0, we can always find a p such that
$I(m h_p) < \varepsilon$. The fact that $G_m$ is a set of measure zero now follows from
the corollary at the end of Sec. 1.4.


LEMMA 2. If a sequence of nonnegative step functions $h_1(x), \dots, h_p(x), \dots$
is nonincreasing, and if $\lim_{p\to\infty} h_p(x) = 0$ almost everywhere,
then

$$\lim_{p\to\infty} Ih_p = 0.$$


Proof. First suppose $h_p(x)$ converges to zero everywhere, and let
Z denote the set of discontinuity points of all the functions $h_p$. Clearly,
Z is a set of measure zero. Given any ε > 0, we cover Z with a collection
of blocks $B_1, B_2, \dots$ whose total volume is less than ε. With each of the
remaining points x' we associate an integer m = m(x') such that
$h_m(x') < \varepsilon$ and a block B'(x') containing x' such that $h_m$ has a constant
value in B'(x'). Together, the blocks $B_1, B_2, \dots$ and the blocks B'(x')
form a covering of the basic block B, from which we can select a finite
subcovering, whose blocks will be denoted by $B_1, \dots, B_r, B'_1, \dots, B'_s$.
Let p be the largest of the integers associated with the corresponding
points $x'_1, \dots, x'_s$. Then the function $h_p(x)$ and all step functions with
higher indices do not exceed ε in the blocks $B'_1, \dots, B'_s$. Moreover, in
the blocks $B_1, \dots, B_r$, whose total volume is less than ε by construction,
$h_p(x)$ does not exceed $M_1$, the maximum of $h_1(x)$ on B. It can be assumed
that no two of the blocks $B'_1, \dots, B'_s$ have interior points in common (this
can always be achieved by going over to a finer collection of blocks and then
excluding shared parts of blocks), and therefore the sum of the volumes
of the blocks $B'_1, \dots, B'_s$ can be regarded as no larger than the volume
of the basic block B itself. Hence, for the integral of the function $h_p(x)$
over the block B and for the integral of any step function with a higher
index, we have the estimate

$$Ih_p \le M_1\varepsilon + \varepsilon s(B).$$

Since ε can be chosen arbitrarily small, it follows that $Ih_p \to 0$, as asserted.




Now suppose $h_p(x)$ does not converge to zero everywhere, but only
almost everywhere. Consider the set Z of measure zero on which the
sequence $h_p(x)$ fails to approach zero. According to Theorem 2, Sec. 1.4,
given any ε > 0, there is a nondecreasing sequence of nonnegative step
functions $k_p(x)$ such that

$$\sup_p k_p(x) \ge 1$$
for every x ∈ Z and
$$Ik_p < \frac{\varepsilon}{M_1}$$
for every p = 1, 2, .... Moreover, the limits
$$\lim_{p\to\infty} Ih_p \ge 0, \qquad \lim_{p\to\infty} Ik_p \le \frac{\varepsilon}{M_1}$$
obviously exist, while the difference $h_p - M_1k_p$ is nonincreasing and
has a nonpositive limit everywhere. Therefore, by the first part of the
proof,
$$I(h_p - M_1k_p) \le I(h_p - M_1k_p)^+ \to 0,$$

and hence
$$\lim_{p\to\infty} Ih_p - M_1 \lim_{p\to\infty} Ik_p = \lim_{p\to\infty} I(h_p - M_1k_p) \le 0.$$
But then
$$0 \le \lim_{p\to\infty} Ih_p \le M_1 \lim_{p\to\infty} Ik_p \le M_1 \frac{\varepsilon}{M_1} = \varepsilon.$$

Since ε is arbitrary,

$$\lim_{p\to\infty} Ih_p = 0,$$

and the lemma is proved.
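
A simple one-dimensional example shows both lemmas at work. The sketch below assumes B = [0, 1] and takes $h_p$ to be the step function equal to 1 on [0, 1/p] and 0 elsewhere; this particular sequence and the function names are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def h(p, x):
    """Step function h_p on B = [0, 1]: equal to 1 on [0, 1/p], 0 elsewhere."""
    return 1 if 0 <= x <= Fraction(1, p) else 0

def integral_h(p):
    """Ih_p = 1 * (length of [0, 1/p]) = 1/p."""
    return Fraction(1, p)

sample_points = [Fraction(k, 50) for k in range(51)]
indices = range(1, 201)

# The sequence is nonincreasing in p at every sample point.
nonincreasing = all(h(p, x) >= h(p + 1, x) for p in indices for x in sample_points)

# For every sample point x > 0 we have x >= 1/50, so h_p(x) = 0 once p > 50;
# h_200 therefore already shows the limiting values at the sample points.
limit_values = {x: h(200, x) for x in sample_points}

integrals = [integral_h(p) for p in indices]
print(nonincreasing, limit_values[Fraction(0)], integrals[-1])
```

Here the integrals tend to zero while the pointwise limit fails to vanish only at the single point x = 0, which is a set of measure zero, in agreement with Lemma 1 and Lemma 2.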


1.6. Application to the Theory of the Riemann Integral 


The lower Darboux sum (see Sec. 1.2)
$$\underline{D}_\Pi(f) = \sum_{k=1}^{m} m_k s(B_k)$$

can be interpreted as the integral of a "lower" step function $\underline{h}_\Pi(x)$, taking
the value $m_k$ in the block $B_k$. Similarly, the upper Darboux sum

$$\overline{D}_\Pi(f) = \sum_{k=1}^{m} M_k s(B_k)$$

is the integral of an "upper" step function $\overline{h}_\Pi(x)$, taking the value $M_k$ in $B_k$.
In this way, a sequence of partitions $\Pi_1, \dots, \Pi_q, \dots$ of the block B gives rise
to a sequence of lower step functions $\underline{h}_1(x), \dots, \underline{h}_q(x), \dots$ and a sequence of
upper step functions $\overline{h}_1(x), \dots, \overline{h}_q(x), \dots$. Moreover, if every partition $\Pi_{q+1}$
is a refinement of its predecessor $\Pi_q$, then the sequence of lower step
functions $\underline{h}_q(x)$ is nondecreasing and the sequence of upper step functions is
nonincreasing. Assuming that $d(\Pi_q) \to 0$, we introduce the lower function

$$\underline{f}(x) = \lim_{q\to\infty} \underline{h}_q(x)$$

and the upper function
$$\overline{f}(x) = \lim_{q\to\infty} \overline{h}_q(x),$$

where obviously

$$\underline{f}(x) \le f(x) \le \overline{f}(x).$$


THEOREM 3. The function f(x) is Riemann integrable if and only if
$\underline{f}(x)$ and $\overline{f}(x)$ coincide almost everywhere.⁷


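Proof. Suppose f(x) is Riemann integrable. Then

$$\lim_{q\to\infty} I\underline{h}_q = \underline{\int_B} f(x)\,dx = \overline{\int_B} f(x)\,dx = \lim_{q\to\infty} I\overline{h}_q,$$

and hence

$$\lim_{q\to\infty} I(\overline{h}_q - \underline{h}_q) = 0.$$

But the sequence $\overline{h}_q - \underline{h}_q$ is nonincreasing. Therefore, by Lemma 1,
Sec. 1.5,

$$0 = \lim_{q\to\infty} (\overline{h}_q - \underline{h}_q) = \lim_{q\to\infty} \overline{h}_q - \lim_{q\to\infty} \underline{h}_q = \overline{f} - \underline{f}$$
almost everywhere, as asserted.
Conversely, suppose $\underline{f}(x) = \overline{f}(x)$ almost everywhere, which means
that

$$\lim_{q\to\infty} (\overline{h}_q - \underline{h}_q) = 0$$

almost everywhere. Then by Lemma 2,

$$\lim_{q\to\infty} I(\overline{h}_q - \underline{h}_q) = 0,$$

and hence
$$\underline{\int_B} f(x)\,dx = \lim_{q\to\infty} I\underline{h}_q = \lim_{q\to\infty} I\overline{h}_q = \overline{\int_B} f(x)\,dx,$$
i.e., f(x) is Riemann integrable (see Sec. 1.2), and the proof is complete.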
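
The interplay between Darboux sums and step functions can be seen in a small computation. The sketch below assumes B = [0, 1], the function f(x) = x², and the partitions of [0, 1] into 2^q equal subintervals; since this f is nondecreasing, its infimum and supremum on a subinterval are its values at the left and right endpoints. The function name `darboux_integrals` is a hypothetical helper.

```python
from fractions import Fraction

def darboux_integrals(f, q):
    """Integrals of the lower and upper step functions of a nondecreasing f
    on [0, 1] for the partition into 2**q equal subintervals."""
    n = 2 ** q
    width = Fraction(1, n)
    # inf of f on [k/n, (k+1)/n] is f(k/n); sup is f((k+1)/n)
    lower = sum(f(Fraction(k, n)) * width for k in range(n))
    upper = sum(f(Fraction(k + 1, n)) * width for k in range(n))
    return lower, upper

f = lambda x: x * x
results = [darboux_integrals(f, q) for q in range(1, 9)]
for q, (lo, up) in enumerate(results, start=1):
    print(q, float(lo), float(up), up - lo)
```

Each partition refines its predecessor, so the lower integrals increase, the upper integrals decrease, and their difference is exactly 1/2^q, which tends to zero as Theorem 3 requires for a Riemann integrable function.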

*1.7. Invariant Definition of Lower and Upper Functions. Lebesgue's Criterion for Riemann Integrability


The definition of lower and upper functions given in the preceding
section depends explicitly on the choice of the sequence of partitions $\Pi_q$. We


⁷ And hence coincide almost everywhere with the function f(x).




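now show that lower and upper functions can be defined directly from
f(x), at least to within a set of measure zero. With this aim, given any $x_0 \in B$,
we write⁸

$$\underline{f}^*(x_0) = \varliminf_{x\to x_0} f(x), \qquad \overline{f}^*(x_0) = \varlimsup_{x\to x_0} f(x). \tag{10}$$

Let $\Pi_q$ (q = 1, 2, ...) be a sequence of partitions of the block B such that
$\Pi_{q+1}$ is a refinement of $\Pi_q$ and $d(\Pi_q) \to 0$ (as in Sec. 1.6), and let $\underline{f}(x)$ and
$\overline{f}(x)$ be the corresponding lower and upper functions. Then

$$\underline{f}(x) = \underline{f}^*(x), \qquad \overline{f}(x) = \overline{f}^*(x) \tag{11}$$

almost everywhere. In fact, the relations (11) hold at every point $x_0$ which
for arbitrary q is an interior point of a block $B_q(x_0)$ of the partition $\Pi_q$. To
see this, we observe that, given any ε > 0, there is a ball $U_\varepsilon(x_0)$ with center
at the point $x_0$ such that $x \in U_\varepsilon(x_0)$ implies $f(x) > \underline{f}^*(x_0) - \varepsilon$. The block
$B_q(x_0)$ lies in the ball $U_\varepsilon(x_0)$ for sufficiently large q, and hence
$f(x) > \underline{f}^*(x_0) - \varepsilon$ for all $x \in B_q(x_0)$. But then

$$\underline{h}_q(x_0) = \inf_{x \in B_q(x_0)} f(x) \ge \underline{f}^*(x_0) - \varepsilon,$$

which implies

$$\underline{f}(x_0) = \lim_{q\to\infty} \underline{h}_q(x_0) \ge \underline{f}^*(x_0) - \varepsilon \tag{12}$$

as well. On the other hand, the block $B_q(x_0)$ certainly contains points x
such that

$$f(x) < \underline{f}^*(x_0) + \varepsilon,$$
and hence
$$\underline{h}_q(x_0) = \inf_{x \in B_q(x_0)} f(x) < \underline{f}^*(x_0) + \varepsilon.$$
It follows that
$$\underline{f}(x_0) = \lim_{q\to\infty} \underline{h}_q(x_0) \le \underline{f}^*(x_0) + \varepsilon. \tag{13}$$

Comparing (12) and (13), and taking the limit as ε → 0, we find that
$$\underline{f}(x_0) = \underline{f}^*(x_0).$$

In the same way, it can be shown that

$$\overline{f}(x_0) = \overline{f}^*(x_0).$$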


⁸ Thus the first quantity in (10) is the limit inferior of f(x) as $x \to x_0$, i.e., the smallest
limiting value of f(x) as $x \to x_0$. Similarly, the second quantity in (10) is the limit superior
of f(x) as $x \to x_0$, i.e., the largest limiting value of f(x) as $x \to x_0$. [We say that c is a
limiting value of f(x) as $x \to x_0$ if, given any ε > 0 and any ball U with center $x_0$
(synonymously, any neighborhood of $x_0$), there is a point x ∈ U such that |f(x) − c| < ε.]




THEOREM 4 (Lebesgue's criterion for Riemann integrability). The
function f(x) is Riemann integrable if and only if the set of discontinuity
points of f(x) is of measure zero.

Proof. Let $\underline{f}^*$ and $\overline{f}^*$ denote the limit inferior and limit superior
defined in (10). Obviously, $x_0$ is a continuity point of f(x) if and only if

$$\underline{f}^*(x_0) = f(x_0) = \overline{f}^*(x_0).$$
If f(x) is Riemann integrable, then

$$\underline{f}^*(x) = \underline{f}(x) = f(x) = \overline{f}(x) = \overline{f}^*(x)$$
almost everywhere, and hence almost every point x is a continuity point of
f(x). Conversely, if the set of continuity points of f(x) is of full measure,
then

$$\underline{f}(x) = \underline{f}^*(x) = f(x) = \overline{f}^*(x) = \overline{f}(x)$$

holds on a set of full measure, and hence f(x) is Riemann integrable.


1.8. Generalization of the Riemann Integral: The Key Idea 


As we saw in Sec. 1.6, if a function f(x) is Riemann integrable, then it is
the limit (in the sense of convergence almost everywhere) of a nondecreasing
sequence of step functions, in fact, of the functions $\underline{h}_q(x)$; at the same time,
f(x) is the limit (in the same sense) of a nonincreasing sequence of step
functions, in fact, of the functions $\overline{h}_q(x)$. The converse is also true: If a function
f(x) is the limit (in the sense of convergence almost everywhere) of some
nondecreasing sequence $k_q(x)$ of step functions [not necessarily of the type
$\underline{h}_q(x)$] and at the same time the limit of a nonincreasing sequence of step
functions $l_q(x)$, where $k_q(x) \le f(x) \le l_q(x)$ everywhere, then f(x) is Riemann
integrable. To see this, let $\Pi_q$ denote a partition of the block B into subblocks
in which $k_q(x)$ is constant, and construct the corresponding functions $\underline{h}_q(x)$.
Then

$$k_q(x_0) \le \inf_{x \in B_q(x_0)} f(x) = \underline{h}_q(x_0) \le f(x_0),$$
where $B_q(x_0)$ is the block of the partition $\Pi_q$ containing $x_0$. Since $k_q(x_0) \to f(x_0)$
and hence $\underline{h}_q(x_0) \to f(x_0)$ almost everywhere, it follows that $\underline{f}(x) = f(x)$
almost everywhere. Similarly, $\overline{f}(x) = f(x)$ almost everywhere, so that
f(x) is Riemann integrable. Here, just as in Sec. 1.6, we have

$$\int_B f(x)\,dx = \lim_{q\to\infty} Ik_q = \lim_{q\to\infty} Il_q.$$




Now suppose that all we know about f(x) is that it is the limit, in the
sense of convergence almost everywhere, of a nondecreasing sequence of
step functions $h_q(x)$, where the numerical sequence $Ih_q$ has a limit (this only
requires that the set of numbers $Ih_1, Ih_2, \dots$ be bounded). Then the quantity

$$If = \lim_{q\to\infty} Ih_q$$

will be called the "integral" of f, a definition which, at the very least, does
not contradict the definition of the Riemann integral for functions which are
Riemann integrable. One is immediately led to ask whether the number If
depends only on the function f(x), since If might conceivably depend on the
choice of the sequence $h_q(x)$. Not only is the answer to this question in the
affirmative, but further development of the new definition leads to a theory
of the integral which is free of all the difficulties discussed in the Introduction.
Moreover, and this is a cardinal point, to construct the new theory we need
no longer take account of the specific nature of the region B or of the functions
$h_q(x)$, provided only that the analogues of $h_q(x)$ and their integrals have
certain general properties like those already established for step functions
and their integrals over a block B in n-dimensional space. To point up this
difference, the whole construction in Chap. 2 will be carried out for functions
defined on an abstract set X. In fact, we shall start from some set H of
"elementary functions" h(x) defined on X, assuming that the integrals Ih
are already known and have certain properties, formulated as axioms. Then
the class of integrable functions will be enlarged by using the procedure
already familiar from Sec. 1.6. This whole approach lends great generality
to the construction of the integral, and permits applications of the most
diverse sort.


PROBLEMS 


1. Let F be a closed set obtained by removing a countable collection of disjoint
open intervals $\Delta_1, \dots, \Delta_n, \dots$ from a closed interval [a, b], where the sum of
the lengths of the intervals $\Delta_1, \dots, \Delta_n, \dots$ equals b − a. Show that F is of
measure zero.


Hint. F is covered by the finite collection of intervals obtained by removing
$\Delta_1, \dots, \Delta_n$ from [a, b].


2 (The Cantor set). The "middle third" of the closed interval [0, 1] is removed,
i.e., the open interval (1/3, 2/3) of length 1/3. Next the middle thirds of the two
remaining intervals are removed, i.e., the interval (1/9, 2/9) is removed from
[0, 1/3] and (7/9, 8/9) is removed from [2/3, 1]. Then the middle thirds of each of the
four intervals [0, 1/9], [2/9, 1/3], [2/3, 7/9] and [8/9, 1] are removed, and so on ad infinitum.
The remaining closed set C is called the Cantor set. Prove that

a) C is of measure zero; b) C has the power of the continuum.




Hint. a) Use Prob. 1; b) Compare the points of C written in the ternary 
number system with the points of [0, 1] written in the binary number system. 
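
For orientation on part a), the closed intervals left after n removal steps can be generated directly. The sketch below is only an illustration: the function name `cantor_stage` and the choice n = 10 are assumptions, and turning the computed total length into a covering in the sense of Sec. 1.4 (where points must be interior to the covering blocks) still requires enlarging each interval slightly.

```python
from fractions import Fraction

def cantor_stage(n):
    """Closed intervals remaining after n middle-third removals from [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3
            nxt.append((a, a + third))      # left third is kept
            nxt.append((b - third, b))      # right third is kept
        intervals = nxt
    return intervals

stage = cantor_stage(10)
total = sum(b - a for a, b in stage)        # total length of the remaining intervals
print(len(stage), total, float(total))
```

After n steps there are 2^n intervals of total length (2/3)^n, so the Cantor set is contained in finite unions of intervals of arbitrarily small total length.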


3. Suppose F is a closed set contained in [a, b] such that the sum of the lengths 
of the intervals "adjacent to F" (i.e., the components of [a, b] − F) is less than
b — a. Prove that F is not a set of measure zero. 


Hint. If F were a set of measure zero, the whole interval [a,b] could be 
covered by a finite collection of intervals with total length less than b — a. 


2 


GENERAL THEORY 
OF THE INTEGRAL 


This chapter, in which we carry out the generalization of the integral 
discussed at the end of Sec. 1.8, plays a central role in the whole book. The 
construction given here of a space of summable functions on an arbitrary 
set X, equipped with a given family of elementary functions and a given 
elementary integral, will be the starting point for all subsequent considera- 
tions. 


2.1. Elementary Functions and the Elementary Integral 


Let H be a family of bounded real functions defined on a set X (these 
functions will henceforth be called elementary functions), and suppose H 
satisfies the following axioms: 


a) H is a linear space with the usual operations of addition and 
multiplication by real numbers. 
b) If a function h(x) belongs to H, then so does its absolute value |h(x)|.


It follows that if h(x) belongs to H, then so does its positive part
$h^+(x) = \max\{h(x), 0\}$ and its negative part $h^-(x) = \max\{0, -h(x)\}$, since these
functions can obviously be written as linear combinations of h(x) and |h(x)|:

$$h^+(x) = \tfrac{1}{2}\{|h(x)| + h(x)\}, \qquad h^-(x) = \tfrac{1}{2}\{|h(x)| - h(x)\}.$$


Moreover, if two functions h(x) and k(x) belong to the family H, then so do
the functions max {h(x), k(x)} and min {h(x), k(x)}, since, as is easily
verified,

$$\max\{h(x), k(x)\} = \{h(x) - k(x)\}^+ + k(x),$$

$$\min\{h(x), k(x)\} = -\max\{-h(x), -k(x)\}.$$

Next we assume that every function h ∈ H is assigned a real number Ih,
called the elementary integral of h (over X), which satisfies the following
axioms:

1) If h, k are any two functions in H and α, β are any two real numbers,
then
$$I(\alpha h + \beta k) = \alpha Ih + \beta Ik.$$
2) Nonnegativity axiom. If h(x) ≥ 0, then Ih ≥ 0.
3) Continuity axiom. If $h_n(x)$ is a nonincreasing sequence of functions
in H converging to zero for all x ∈ X, then $Ih_n \to 0$.


It follows from Axioms 1 and 2 that Ih ≤ Ik if h(x) ≤ k(x). In particular,

$$Ih \le I(h^+) \le I(|h|),$$

$$Ih \ge I(-|h|) = -I(|h|),$$
$$|Ih| \le I(|h|)$$

for any h ∈ H.
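
A finite model makes the axioms and their first consequences easy to test. The sketch below assumes a hypothetical three-point set X = {0, 1, 2}, takes H to be all real functions on X (stored as lists), and defines Ih as a weighted sum with fixed nonnegative weights. In this model Axioms 1 and 2 hold by construction, and Axiom 3 holds because Ih is a finite sum of terms each tending to zero. All names and values are illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical model: X = {0, 1, 2}, Ih = sum of w_x * h(x) with weights w_x >= 0.
WEIGHTS = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

def I(h):
    """Elementary integral of h in the finite model."""
    return sum(w * v for w, v in zip(WEIGHTS, h))

def lin(alpha, h, beta, k):
    """Pointwise linear combination alpha*h + beta*k."""
    return [alpha * a + beta * b for a, b in zip(h, k)]

def absval(h):
    return [abs(v) for v in h]

def pos(h):
    """Positive part written as (|h| + h)/2."""
    return [(abs(v) + v) / 2 for v in h]

def neg(h):
    """Negative part written as (|h| - h)/2."""
    return [(abs(v) - v) / 2 for v in h]

h = [Fraction(3), Fraction(-4), Fraction(1)]
k = [Fraction(-1), Fraction(2), Fraction(5)]
alpha, beta = Fraction(2), Fraction(-3)

linear_ok = I(lin(alpha, h, beta, k)) == alpha * I(h) + beta * I(k)
parts_ok = pos(h) == [max(v, 0) for v in h] and neg(h) == [max(0, -v) for v in h]
# max{h, k} = (h - k)^+ + k
max_ok = [a + b for a, b in zip(pos(lin(1, h, -1, k)), k)] == [max(a, b) for a, b in zip(h, k)]
bound_ok = abs(I(h)) <= I(absval(h))
print(linear_ok, parts_ok, max_ok, bound_ok)
```

The same checks would apply to any other choice of nonnegative weights; the specific numbers only fix a concrete instance.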


2.2. Sets of Measure Zero and Sets of Full Measure 


Of the two equivalent definitions of sets of measure zero given in Sec. 
1.4, the definition patterned after Theorem 2, p. 14 is the appropriate one 
to follow here: 


DEFINITION. A set Z ⊂ X is called a set of measure zero if, given any
ε > 0, there exists a nondecreasing sequence of nonnegative functions
$h_p(x) \in H$ such that $Ih_p < \varepsilon$ and

$$\sup_p h_p(x) \ge 1 \quad \text{on } Z.$$


The empty set will also be regarded as a set of measure zero. 


It is easy to see that the union Z of a countable collection of sets
$Z_1, \dots, Z_n, \dots$ of measure zero is itself a set of measure zero. In fact, for any ε > 0
and n, there is a nondecreasing sequence of nonnegative functions $h_p^{(n)} \in H$ (p = 1, 2, ...)
such that $Ih_p^{(n)} < \varepsilon/2^n$ and $\sup_p h_p^{(n)}(x) \ge 1$ on the set $Z_n$. But the sequence

$$h_p = \max\{h_p^{(1)}, \dots, h_p^{(p)}\}$$

is nondecreasing, and moreover
$$Ih_p \le \sum_{k=1}^{p} Ih_p^{(k)} < \varepsilon,$$

while $\sup_p h_p(x) \ge 1$ on the set Z. Therefore Z is of measure zero, as asserted.


A set E ⊂ X is said to be a set of full measure if its complement (relative
to X) is a set of measure zero. By taking complements, we see at once that
the intersection of a countable collection of sets of full measure is itself a set
of full measure.

As usual, if a given property holds at every point of a set of full measure,
i.e., at every point of X except for a set of measure zero, then we say that
the property holds for almost all points of X (or almost everywhere in X).
For example, a sequence of functions $h_p(x) \in H$ converges to zero almost
everywhere if there is a set E of full measure such that $h_p(x)$ converges to
zero for all x ∈ E.


LEMMA. If a nonincreasing sequence of nonnegative functions
$h_p(x) \in H$ converges to zero almost everywhere, then

$$\lim_{p\to\infty} Ih_p = 0.$$

Proof. Let
$$M_1 = \sup_{x \in X} h_1(x),$$

and let Z be the set of measure zero on which the sequence $h_p$ does not
converge to zero. Then, given any ε > 0, there is a nondecreasing sequence
of nonnegative functions $k_p \in H$ such that $Ik_p < \varepsilon/M_1$ and $\sup_p k_p(x) \ge 1$
on the set Z. The limits

$$\lim_{p\to\infty} Ih_p \ge 0, \qquad \lim_{p\to\infty} Ik_p \le \frac{\varepsilon}{M_1}$$

obviously exist, while the difference $h_p - M_1k_p$ is nonincreasing and
has a nonpositive limit everywhere. Therefore, by Axiom 3,

$$I(h_p - M_1k_p) \le I(h_p - M_1k_p)^+ \to 0,$$

and hence
$$\lim_{p\to\infty} Ih_p - M_1 \lim_{p\to\infty} Ik_p = \lim_{p\to\infty} I(h_p - M_1k_p) \le 0.$$
But then
$$0 \le \lim_{p\to\infty} Ih_p \le M_1 \lim_{p\to\infty} Ik_p \le M_1 \frac{\varepsilon}{M_1} = \varepsilon.$$

Since ε is arbitrary,

$$\lim_{p\to\infty} Ih_p = 0,$$

and the lemma is proved.




If a function h ∈ H is nonzero only on a set of measure zero, then Ih = 0.
In fact, applying the lemma to the sequence |h|, |h|, ..., we find that
I(|h|) = 0 and hence

$$|Ih| \le I(|h|) = 0.$$

Therefore, if two functions h ∈ H and k ∈ H differ only on a set of measure
zero, then Ih = Ik.

The last result can be used to strengthen the lemma somewhat, i.e., the
conclusion of the lemma remains true ($Ih_p \to 0$) even if the sequence $h_p$,
which converges to zero almost everywhere, is nonincreasing only almost
everywhere. In fact, replacing $h_2$ by $h'_2 = \min(h_1, h_2)$, $h_3$ by $h'_3 = \min(h'_2, h_3)$,
and so on, we alter the functions of our sequence only on a set of measure
zero, which has no effect on their integrals. But then we get a sequence
which is nonincreasing everywhere and convergent to zero almost everywhere,
and the lemma applies in its original form.

The symbol ↗ will be used in connection with nondecreasing numerical
sequences and also with sequences of functions which are nondecreasing on
a set of full measure. Thus $h_n(x) \nearrow f(x)$ means that the sequence of functions
$h_n(x)$ is nondecreasing and convergent to f(x) on a set of full measure. The
symbol ↘ is interpreted similarly.


2.3. The Class L⁺. Integration in L⁺


We now introduce a class of functions, denoted by L⁺(X), or simply by
L⁺. A function f(x) [which may take infinite values] is said to belong to L⁺
if there exists a sequence of functions $h_n(x) \in H$ such that $h_n \nearrow f$, where the
set of integrals $Ih_1, Ih_2, \dots$ is bounded, i.e.,

$$Ih_n \le C \qquad (n = 1, 2, \dots). \tag{1}$$

First we show that every function f(x) ∈ L⁺ is actually finite almost everywhere.
Let Z ⊂ X be the set of points where f(x) = +∞. It can be assumed
that the functions $h_n(x)$ are nonnegative, since otherwise we need only
replace $h_n(x)$ by $h_n(x) - h_1(x)$. Discarding a set of measure zero, if necessary,
we can assume that the sequence $h_n(x)$ is nondecreasing and convergent to
+∞ on the whole set Z. Given any ε > 0 and any x ∈ Z, the inequality

$$h_n(x) > \frac{C}{\varepsilon}$$

holds, starting from some value of n. Therefore Z is covered by the countable
collection of sets

$$\Bigl\{x: h_n(x) > \frac{C}{\varepsilon}\Bigr\} \qquad (n = 1, 2, \dots),$$

and hence, on the set Z we certainly have

$$\sup_n \Bigl\{\frac{\varepsilon}{C} h_n(x)\Bigr\} \ge 1.$$

At the same time, by (1),

$$I\Bigl(\frac{\varepsilon}{C} h_n\Bigr) = \frac{\varepsilon}{C}\, Ih_n \le \varepsilon,$$

i.e., Z is a set of measure zero by the definition given in Sec. 2.2.

It is also apparent from the very definition of the class L⁺ that if f(x)
belongs to L⁺, then so does every function $f_1(x)$ which differs from f(x) only
on a set of measure zero. Obviously, every function h(x) ∈ H belongs to
L⁺, and so does every function $h_1(x)$ which differs from h(x) only on a set of
measure zero. In particular, every function differing from zero only on a set
of measure zero belongs to the class L⁺.

Next we define the integral of a function f of the class L⁺ by the formula

$$If = \lim_{n\to\infty} Ih_n, \tag{2}$$
where $h_n$ is the sequence of functions of the class H figuring in the definition
of the function f. Since the sequence of numbers $Ih_n$ is nondecreasing and
bounded, the limit in the right-hand side of (2) certainly exists, but we
must still show that it does not depend on the choice of the sequence $h_n$
defining the function f. This will be shown after proving the following more
general fact: If $h_m$ and $k_n$ are two sequences of functions of the class H such
that both sets $Ih_1, Ih_2, \dots$ and $Ik_1, Ik_2, \dots$ are bounded, and if

$$h_m \nearrow f, \qquad k_n \nearrow g, \qquad f \le g$$

almost everywhere, then
$$\lim_{m\to\infty} Ih_m \le \lim_{n\to\infty} Ik_n. \tag{3}$$

To see this, we hold the index m fixed and consider the nonincreasing
sequence

$$h_m - k_n \qquad (n = 1, 2, \dots)$$

of functions in H. This sequence has the limit
$$h_m - g \le f - g \le 0.$$

But then $(h_m - k_n)^+ \searrow 0$ (almost everywhere), and hence, by Axiom 3
(in the strengthened form of the lemma of Sec. 2.2), $I(h_m - k_n)^+ \searrow 0$.
Since $I(h_m - k_n) \le I(h_m - k_n)^+$, it follows that the
integral $I(h_m - k_n) = Ih_m - Ik_n$ is nonincreasing and has a nonpositive
limit, which implies

$$Ih_m \le \lim_{n\to\infty} Ik_n.$$

Since this inequality holds for arbitrary m, we can take the limit as m → ∞,
obtaining the desired result (3).
Setting g = f, we find that If ≤ Ig, but because of the complete equivalence
of f and g, we also have Ig ≤ If. It follows that If = Ig. Thus the definition
(2) of the integral of a function f ∈ L⁺ is unique. Moreover, if f ∈ L⁺, g ∈ L⁺,
f ≤ g, then If ≤ Ig.
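
The independence of If from the defining sequence can be illustrated numerically in the setting of Chap. 1. The sketch below assumes X = [0, 1] with step functions as elementary functions, and takes f(x) = 1/√x, which is unbounded near 0 and so lies outside the scope of the Riemann integral of bounded functions. Two different nondecreasing sequences of step functions (built on dyadic and on triadic partitions) are used; the function names are hypothetical, and floating-point arithmetic makes this an approximation rather than a proof.

```python
import math

def f(x):
    """f(x) = 1/sqrt(x) on (0, 1]; unbounded near 0."""
    return 1.0 / math.sqrt(x)

def step_integral(cells):
    """Ih for the step function equal to inf f on each of `cells` equal subintervals of [0, 1].

    Because f is decreasing, its infimum on ((k-1)/cells, k/cells] is f(k/cells).
    """
    width = 1.0 / cells
    return math.fsum(f(k * width) * width for k in range(1, cells + 1))

dyadic = [step_integral(2 ** n) for n in range(1, 17)]    # partitions into 2**n cells
triadic = [step_integral(3 ** n) for n in range(1, 11)]   # partitions into 3**n cells
print(dyadic[-1], triadic[-1])
```

Both sequences of integrals are nondecreasing and bounded by 2, and their last terms agree to within a few parts in ten thousand, consistent with a common limit If = 2.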


2.4. Properties of the Integral in the Class L⁺


Next, by the familiar process of passing to the limit, some (but not all)
of the properties of integrals in the class H can be carried over to integrals of
functions in the class L⁺. In fact, it is easily verified that

a) If f ∈ L⁺, g ∈ L⁺, then f + g ∈ L⁺, and
$$I(f + g) = If + Ig.$$
b) If f ∈ L⁺, then αf ∈ L⁺ for every α ≥ 0, and¹
$$I(\alpha f) = \alpha If.$$

c) If f ∈ L⁺, g ∈ L⁺, then min (f, g) ∈ L⁺, max (f, g) ∈ L⁺. In particular,
if f ∈ L⁺, then²
$$f^+ = \max(f, 0) \in L^+.$$


Next we show that the class L⁺ is closed under passage to the limit of
nondecreasing sequences of functions with bounded integrals:


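THEOREM 1. If $f_n \in L^+$ (n = 1, 2, ...), $f_n \nearrow f$ and $If_n \le C$, then
f ∈ L⁺ and
$$If = \lim_{n\to\infty} If_n.$$
Proof. For each function $f_n$ we construct the appropriate defining
sequence of functions in H:

$$\begin{array}{ll}
h_{11} \le h_{12} \le \cdots \le h_{1n} \le \cdots, & h_{1n} \nearrow f_1, \\
h_{21} \le h_{22} \le \cdots \le h_{2n} \le \cdots, & h_{2n} \nearrow f_2, \\
\cdots & \cdots \\
h_{k1} \le h_{k2} \le \cdots \le h_{kn} \le \cdots, & h_{kn} \nearrow f_k, \\
\cdots & \cdots
\end{array}$$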


¹ Note that L⁺ is not closed under subtraction or multiplication by negative numbers,
since we must always deal with nondecreasing sequences of functions $h_n \in H$.
² The same is not true of the functions f⁻ and |f|.




Then let $h_n = \max(h_{1n}, \dots, h_{nn})$. Obviously, $h_n$ is also a function of
the class H, and the sequence $h_n$ is nondecreasing. Moreover

$$h_n \le \max(f_1, \dots, f_n) = f_n,$$
and hence $Ih_n \le If_n \le C$. Writing

$$f^* = \lim_{n\to\infty} h_n,$$

we find, by the definition of the class L⁺, that f* ∈ L⁺ and
$$If^* = \lim_{n\to\infty} Ih_n.$$

But since $h_{kn} \le h_n \le f_n$ for any fixed k and n ≥ k, passing to the limit
n → ∞ gives $f_k \le f^* \le f$. Since $f_k \nearrow f$ by hypothesis, it follows that
f* = f (almost everywhere), and hence f ∈ L⁺. Moreover
$Ih_{kn} \le Ih_n \le If_n \le If$, and since $Ih_n \nearrow If^* = If$, we have $If_n \nearrow If$ as well. This
completes the proof.


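COROLLARY. Let $g_k \in L^+$, $g_k \ge 0$ be such that

$$I\Bigl(\sum_{k=1}^{n} g_k\Bigr) \le C \qquad (n = 1, 2, \dots).$$

Then
$$f = \sum_{k=1}^{\infty} g_k$$
belongs to L⁺, and
$$If = \sum_{k=1}^{\infty} Ig_k.$$
Proof. We need only set
$$f_n = \sum_{k=1}^{n} g_k$$

and then apply Theorem 1.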


2.5. The Class L. Integration in L 


We now complete the construction of the integral by extending it from
the class L⁺ to a wider class L = L(X), which is closed under all natural
algebraic operations. By a summable (or Lebesgue-integrable) function we
mean a function φ(x) which can be represented on a set of full measure as
the difference

$$\varphi = f - g$$

between two functions f and g of the class L⁺. The set of all summable




functions will be denoted by L. The following operations can then be carried
out in L:
a) Addition. If $\varphi = f - g$ and $\varphi_1 = f_1 - g_1$ are summable functions,
so that f, g, $f_1$ and $g_1$ belong to L⁺, then
$$\varphi + \varphi_1 = (f + f_1) - (g + g_1)$$
and hence $\varphi + \varphi_1$ is summable, since $f + f_1 \in L^+$, $g + g_1 \in L^+$.
b) Multiplication by an arbitrary real number α. If α ≥ 0, then $\varphi = f - g$,
f ∈ L⁺, g ∈ L⁺ implies $\alpha\varphi = \alpha f - \alpha g$, αf ∈ L⁺, αg ∈ L⁺, and hence
αφ ∈ L. On the other hand, α < 0 implies −α > 0 and then
$\alpha\varphi = (-\alpha)g - (-\alpha)f$ implies αφ ∈ L, as before. Together, a) and b) show
that any linear combination of summable functions is also summable.

c) The operations |φ|, φ⁺, φ⁻. If $\varphi = f - g$, f ∈ L⁺, g ∈ L⁺, then
max (f, g) ∈ L⁺, min (f, g) ∈ L⁺ and hence
$$|\varphi| = \max(f, g) - \min(f, g)$$
belongs to L. Since

$$\varphi^+ = \tfrac{1}{2}(|\varphi| + \varphi), \qquad \varphi^- = \tfrac{1}{2}(|\varphi| - \varphi),$$

we see that φ⁺ ∈ L, φ⁻ ∈ L if φ ∈ L. Moreover, it follows from the
formulas

$$\max(\varphi, \psi) = (\varphi - \psi)^+ + \psi,$$
$$\min(\varphi, \psi) = -\max(-\varphi, -\psi)$$
that max (φ, ψ) ∈ L, min (φ, ψ) ∈ L if φ ∈ L, ψ ∈ L.
To define the integral of a function φ ∈ L, suppose φ has the decomposition

$$\varphi = f - g, \qquad f \in L^+, \quad g \in L^+. \tag{4}$$

Then we set

$$I\varphi = If - Ig,$$
and call Iφ the Lebesgue integral of the function φ(x), conventionally written
as

$$\int_X \varphi(x)\,dx.$$

It should be noted that the construction of the Lebesgue integral given here
is due to Daniell,³ and the construction originally given by Lebesgue in
1902 (with which we shall become acquainted in Part 3) is based on a different
approach.


³ P. J. Daniell, A general form of integral, Ann. of Math., 19, 279 (1917). See also L. H.
Loomis, An Introduction to Abstract Harmonic Analysis, D. Van Nostrand Co., Inc.,
Princeton, N.J. (1953).




Next we verify that the number Iφ is uniquely defined. Suppose that
besides (4), there is a second decomposition

$$\varphi = f_1 - g_1, \qquad f_1 \in L^+, \quad g_1 \in L^+.$$
Then we want to show that

$$If - Ig = If_1 - Ig_1, \tag{5}$$

or equivalently, that

$$If + Ig_1 = Ig + If_1. \tag{6}$$

But since $f + g_1 = g + f_1$, we have

$$I(f + g_1) = I(g + f_1)$$

because of the uniqueness of the integral in L⁺, and this implies (6) and
hence (5).

The integral just defined in the class L has the usual linearity properties.
Let $\varphi = f - g$, $\varphi_1 = f_1 - g_1$, where f, g, $f_1$ and $g_1$ belong to the class L⁺.
Then

$$\varphi + \varphi_1 = (f + f_1) - (g + g_1),$$

and according to the definition,

$$I(\varphi + \varphi_1) = I(f + f_1) - I(g + g_1) = If + If_1 - Ig - Ig_1$$
$$= (If - Ig) + (If_1 - Ig_1) = I\varphi + I\varphi_1,$$
i.e., the integral of a sum equals the sum of the integrals. Moreover, if
α ≥ 0, then
$$I(\alpha\varphi) = I(\alpha f - \alpha g) = I(\alpha f) - I(\alpha g) = \alpha If - \alpha Ig$$

$$= \alpha(If - Ig) = \alpha I\varphi.$$
On the other hand,

$$I(-\varphi) = I(g - f) = Ig - If = -I\varphi,$$
and hence, if α < 0, we have
$$I(\alpha\varphi) = I(-|\alpha|\varphi) = -I(|\alpha|\varphi) = -|\alpha| I\varphi = \alpha I\varphi.$$

Therefore, regardless of its sign, the number α can be brought in front of
the integral sign.

Next we note that if $\varphi \in L$, $\varphi \ge 0$, then $I\varphi \ge 0$. In fact, if $\varphi = f - g$,
$f \in L^+$, $g \in L^+$ and $\varphi \ge 0$, then $f \ge g$ and hence $If \ge Ig$, or equivalently,
$I\varphi = If - Ig \ge 0$. Similarly $\varphi_1 \in L$, $\varphi_2 \in L$, $\varphi_1 \le \varphi_2$ implies
$$I\varphi_1 \le I\varphi_2,$$
and hence
$$|I\varphi| \le I(|\varphi|),$$
since $\pm\varphi \le |\varphi|$.


Remark. Other conditions can be imposed on the functions $f$ and $g$
figuring in the decomposition $\varphi = f - g$, $f \in L^+$, $g \in L^+$ of a summable
function. For example, given any $\varepsilon > 0$, we can always choose $g$ such that
$g \ge 0$, $Ig < \varepsilon$. To see this, consider a sequence $h_n \nearrow g$ of functions in $H$
such that $Ig = \lim Ih_n$, and then write
$$\varphi = f - g = (f - h_n) - (g - h_n) = f_n - g_n.$$
Here $f_n$ belongs to $L^+$, since $f_n = f - h_n = f + (-h_n)$ is a sum of two functions
in $L^+$ (the second function even belongs to $H$), and the same is true of
$g_n$. It is obvious that for sufficiently large $n$, the function $g_n = g - h_n$
satisfies the stipulated conditions $g_n \ge 0$, $Ig_n < \varepsilon$. In fact, if $\varphi \ge 0$, the
function $f_n = f - h_n \ge f - g = \varphi$ also turns out to be nonnegative.


2.6. Levi’s Theorem 


We now prove an important theorem on term-by-term integration of 
series with nonnegative terms: 


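THEOREM 2 (Levi's theorem). Let $\varphi_k \in L$, $\varphi_k \ge 0$ be such that
$$\sum_{k=1}^{\infty} I\varphi_k < \infty.$$
Then
$$\varphi = \sum_{k=1}^{\infty} \varphi_k$$
is a summable function, and
$$I\varphi = \sum_{k=1}^{\infty} I\varphi_k.$$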
k=1 


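Proof. Using the remark at the end of Sec. 2.5, we represent each
$\varphi_k$ in the form
$$\varphi_k = f_k - g_k, \qquad f_k \in L^+, \; g_k \in L^+,$$
where $f_k \ge 0$, $g_k \ge 0$, $Ig_k < 1/2^k$ $(k = 1, 2, \ldots)$. Then the functions $g_k$
satisfy all the conditions of the corollary to Theorem 1, since $g_k \ge 0$ and
$$I\Bigl(\sum_{k=1}^{n} g_k\Bigr) = \sum_{k=1}^{n} Ig_k < 1.$$
Therefore
$$g = \sum_{k=1}^{\infty} g_k$$
belongs to $L^+$, and
$$Ig = \sum_{k=1}^{\infty} Ig_k.$$
Moreover, the functions $f_k$ also satisfy the conditions of the corollary.
In fact, $f_k \ge 0$ and
$$I\Bigl(\sum_{k=1}^{n} f_k\Bigr) = \sum_{k=1}^{n} I\varphi_k + \sum_{k=1}^{n} Ig_k < \sum_{k=1}^{\infty} I\varphi_k + 1.$$
Therefore
$$f = \sum_{k=1}^{\infty} f_k$$
also belongs to $L^+$, and
$$If = \sum_{k=1}^{\infty} If_k.$$
It follows that
$$\varphi = \sum_{k=1}^{\infty} \varphi_k = \sum_{k=1}^{\infty} f_k - \sum_{k=1}^{\infty} g_k = f - g$$
belongs to $L$, and
$$I\varphi = If - Ig = \sum_{k=1}^{\infty} If_k - \sum_{k=1}^{\infty} Ig_k = \sum_{k=1}^{\infty} (If_k - Ig_k) = \sum_{k=1}^{\infty} I\varphi_k,$$
as asserted.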
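As a concrete illustration of term-by-term integration of a series with nonnegative terms, consider the functions $x^k/k!$ on $[0, 1]$, whose sum over $k \ge 1$ is $e^x - 1$. The sketch below is an assumption-laden numerical check, not part of the text: it uses the ordinary integral on $[0, 1]$, approximated by a midpoint rule, in place of the abstract integral $I$, and truncates the series at a hypothetical cutoff `K`.

```python
import math

def midpoint_integral(func, a, b, n=2000):
    """Approximate the integral of func over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def term(k):
    """The k-th nonnegative term x^k / k! of the series."""
    return lambda x: x ** k / math.factorial(k)

K = 25  # truncation index; the neglected tail is far below the quadrature error

# Sum of the integrals of the individual terms.
sum_of_integrals = sum(midpoint_integral(term(k), 0.0, 1.0) for k in range(1, K + 1))

# Integral of the sum, which is e^x - 1 in closed form.
integral_of_sum = midpoint_integral(lambda x: math.exp(x) - 1.0, 0.0, 1.0)

exact = math.e - 2.0  # both quantities should be close to e - 2
print(sum_of_integrals, integral_of_sum, exact)
```

Both computed numbers should agree with $e - 2$ up to the quadrature error, in line with the equality $I\varphi = \sum I\varphi_k$.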


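COROLLARY 1. If $\psi_n \in L$ $(n = 1, 2, \ldots)$, $\psi_n \nearrow \psi$ and $I\psi_n \le C$, then
$\psi \in L$ and
$$I\psi = \lim I\psi_n.$$

Proof. We need only set
$$\varphi_1 = \psi_1, \quad \varphi_2 = \psi_2 - \psi_1, \quad \ldots, \quad \varphi_{n+1} = \psi_{n+1} - \psi_n, \quad \ldots,$$
and then apply Levi's theorem. Of course, a similar result also holds for
a nonincreasing sequence $\psi_n \searrow \psi$, provided that $I\psi_n \ge C$.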


It is clear that if a function $\varphi(x) \in L$ is nonzero only on a set of measure
zero, then $I\varphi = 0$. We now ask whether the converse is true, i.e., does
$I\varphi = 0$ imply that $\varphi(x) = 0$ almost everywhere? Naturally, it must now
be assumed that $\varphi$ does not change sign (e.g., is nonnegative), since otherwise
$I\varphi$ can vanish because of mutual cancellation of the integrals $I\varphi^+$ and $I\varphi^-$.
Thus, assuming that $\varphi_0 \in L$, $\varphi_0 \ge 0$ and $I\varphi_0 = 0$, we now show that $\varphi_0 = 0$
almost everywhere. Let $\varphi_n = n\varphi_0$. Then the functions $\varphi_n$ converge to a limit
$\varphi$ equal to zero where $\varphi_0$ vanishes and to $+\infty$ where $\varphi_0 > 0$. According to
Corollary 1, the limit function $\varphi$ must be summable. But then $\varphi(x) = +\infty$
only on a set of measure zero, and hence $\varphi_0(x) > 0$ only on a set of measure
zero. This proves

COROLLARY 2. If the integral of a nonnegative summable function
$\varphi_0(x)$ is zero, then $\varphi_0(x) = 0$ almost everywhere.




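COROLLARY 3. Given a set $Z \subset X$, suppose that for every $\varepsilon > 0$,
there exists a sequence of summable functions
$$0 \le \varphi_1^{(\varepsilon)}(x) \le \cdots \le \varphi_n^{(\varepsilon)}(x) \le \cdots$$
such that
$$I\varphi_n^{(\varepsilon)} < \varepsilon \qquad (n = 1, 2, \ldots)$$
and
$$\sup_n \varphi_n^{(\varepsilon)}(x) \ge 1 \qquad (x \in Z).$$
Then $Z$ is a set of measure zero.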


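Proof. If the $\varphi_n^{(\varepsilon)}(x)$ are elementary functions, the corollary follows
at once from the definition of a set of measure zero. In the general case,
let
$$\varphi^{(\varepsilon)}(x) = \lim_{n \to \infty} \varphi_n^{(\varepsilon)}(x).$$
Then, by Corollary 1, $\varphi^{(\varepsilon)}(x)$ is summable and
$$I\varphi^{(\varepsilon)} = \lim_{n \to \infty} I\varphi_n^{(\varepsilon)} \le \varepsilon.$$
Now choosing $\varepsilon = 1, 1/2, \ldots, 1/n, \ldots$, let
$$\psi_1 = \varphi^{(1)}, \quad \psi_2 = \min\{\varphi^{(1)}, \varphi^{(1/2)}\}, \quad \ldots, \quad \psi_n = \min\{\varphi^{(1)}, \varphi^{(1/2)}, \ldots, \varphi^{(1/n)}\}, \quad \ldots$$
The functions $\psi_n$ are nonnegative, and $\ge 1$ on the set $Z$. Moreover
$$\psi_1 \ge \psi_2 \ge \cdots \ge \psi_n \ge \cdots \ge 0$$
and
$$I\psi_n \le I\varphi^{(1/n)} \le \frac{1}{n}.$$
If
$$\psi(x) = \lim_{n \to \infty} \psi_n(x),$$
then, by Corollary 1, $\psi \in L$ and
$$I\psi = \lim_{n \to \infty} I\psi_n = 0.$$
Clearly $\psi(x)$ is nonnegative, and $\ge 1$ on the set $Z$. According to Corollary
2, the set
$$Z_1 = \{x : \psi(x) > 0\},$$
which obviously contains $Z$, is of measure zero. But then $Z$ itself is of
measure zero, as asserted.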


2.7. Lebesgue’s Theorem 


From now on, we shall consider arbitrary (nonmonotonic) passages to
the limit. Classical examples show that we cannot expect theorems of the
form "$\varphi_n \to \varphi$ implies $I\varphi_n \to I\varphi$" to hold without further assumptions about
the way the sequence $\varphi_n$ converges to its limit. For example, consider the
functions
$$\varphi_n(x) = \begin{cases} n \sin nx & \text{for } 0 \le x \le \dfrac{\pi}{n}, \\[1ex] 0 & \text{for } \dfrac{\pi}{n} \le x \le \pi. \end{cases}$$
Then the sequence $\varphi_n$ converges to zero for every $x \in [0, \pi]$, but at the same
time, $I\varphi_n$ does not converge to $I\varphi$ (in fact, $I\varphi_n = 2$ for every $n$).
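The behavior of this example is easy to observe numerically. The sketch below approximates the integrals over $[0, \pi]$ with a midpoint rule (an assumption standing in for the integral $I$; the function names and the number of cells are illustrative choices).

```python
import math

def phi(n, x):
    """phi_n(x) = n sin(nx) on [0, pi/n], and 0 on the rest of [0, pi]."""
    return n * math.sin(n * x) if 0.0 <= x <= math.pi / n else 0.0

def integral_phi(n, cells=100000):
    """Midpoint-rule approximation of the integral of phi_n over [0, pi]."""
    h = math.pi / cells
    return h * sum(phi(n, (i + 0.5) * h) for i in range(cells))

# The integrals stay near 2 for every n ...
integrals = {n: integral_phi(n) for n in (1, 5, 50)}

# ... although at any fixed point x0 the values phi_n(x0) are eventually 0.
x0 = 0.3
values_at_x0 = [phi(n, x0) for n in (11, 50, 500)]

print(integrals, values_at_x0)
```

The hump of $\varphi_n$ becomes taller and narrower as $n$ grows, so its area is preserved while every fixed point eventually lies outside its support.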
Let $L(\varphi_0)$ denote the set of all summable functions $\varphi$ which satisfy
(almost everywhere) the inequality
$$-\varphi_0 \le \varphi \le \varphi_0, \tag{7}$$
where $\varphi_0$ is a fixed nonnegative summable function. Obviously,
$$|I\varphi| \le I\varphi_0$$
for every function $\varphi \in L(\varphi_0)$. Moreover, the limit $\varphi$ of a monotonic sequence
of functions $\varphi_n$ in $L(\varphi_0)$, whether decreasing or increasing, will clearly satisfy
the inequality (7) on some set of full measure (just like the $\varphi_n$ themselves),
and as we have seen above, $\varphi$ is also summable. In other words, the set
$L(\varphi_0)$ is closed under monotonic passages to the limit. It should also be noted
that given any sequence $\varphi_n \in L(\varphi_0)$, we can assert that the functions
$$\sup\,\{\varphi_1(x), \ldots, \varphi_n(x), \ldots\} \tag{8}$$
and
$$\inf\,\{\varphi_1(x), \ldots, \varphi_n(x), \ldots\} \tag{9}$$
also belong to $L(\varphi_0)$. In fact, (8) is the limit as $n \to \infty$ of the nondecreasing
sequence
$$\max\,\{\varphi_1(x), \ldots, \varphi_n(x)\} \in L(\varphi_0),$$
while (9) is the limit of the nonincreasing sequence
$$\min\,\{\varphi_1(x), \ldots, \varphi_n(x)\} \in L(\varphi_0).$$

Now let $\varphi_n \in L(\varphi_0)$ be any sequence converging almost everywhere to a
function $\varphi$. Then $\varphi$ also belongs to $L(\varphi_0)$. To see this, we need only show
that $\varphi$ can be represented as the limit of a monotonic sequence of functions
of the class $L(\varphi_0)$. As just shown, the functions
$$\psi_n(x) = \sup\,\{\varphi_n(x), \varphi_{n+1}(x), \ldots\},$$
$$\chi_n(x) = \inf\,\{\varphi_n(x), \varphi_{n+1}(x), \ldots\}$$
are summable and belong to $L(\varphi_0)$. By considering only values of $x$ where
the functions $\varphi_n(x)$ converge to $\varphi(x)$, we see that
$$\psi_n(x) \ge \lim_{p \to \infty} \varphi_{n+p}(x) = \varphi(x),$$
$$\chi_n(x) \le \lim_{p \to \infty} \varphi_{n+p}(x) = \varphi(x)$$




for almost all $x$. Moreover, the least upper bound of the set $\varphi_n, \varphi_{n+1}, \ldots$ can
only decrease if the first function $\varphi_n$ is omitted, while the greatest lower
bound can only increase. Therefore
$$\psi_{n+1}(x) \le \psi_n(x),$$
$$\chi_{n+1}(x) \ge \chi_n(x),$$
i.e., the sequence $\psi_n(x)$ is nonincreasing and the sequence $\chi_n(x)$ is nondecreasing.
But then $\varphi_n(x) \to \varphi(x)$ implies $\psi_n(x) \searrow \varphi(x)$ and $\chi_n(x) \nearrow \varphi(x)$, and
hence $\varphi(x)$ is the limit of a nondecreasing sequence of functions of the class
$L(\varphi_0)$ [and at the same time, the limit of a nonincreasing sequence of functions
of this class]. It follows that $\varphi \in L(\varphi_0)$, as asserted, and moreover
$$I\psi_n \searrow I\varphi, \qquad I\chi_n \nearrow I\varphi, \qquad I\chi_n \le I\varphi_n \le I\psi_n,$$
so that $I\varphi_n \to I\varphi$. Thus we have proved


THEOREM 3 (Lebesgue's theorem).4 If a sequence of summable functions
$\varphi_n$ converges almost everywhere to a function $\varphi$ and satisfies the
condition
$$|\varphi_n(x)| \le \varphi_0(x) \in L \qquad (n = 1, 2, \ldots), \tag{10}$$
then $\varphi$ is summable and
$$I\varphi = \lim I\varphi_n. \tag{11}$$
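A simple dominated sequence makes the contrast with the preceding counterexample visible. In the sketch below, the functions $x^n$ on $[0, 1]$ are dominated by the constant $1$ and converge to $0$ for every $x < 1$; their integrals $1/(n+1)$ tend to $0$. The use of a midpoint rule on $[0, 1]$ in place of the integral $I$, and all names, are assumptions made for illustration.

```python
def midpoint(func, a, b, cells=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / cells
    return h * sum(func(a + (i + 0.5) * h) for i in range(cells))

def phi(n):
    """phi_n(x) = x^n, which satisfies |phi_n(x)| <= 1 on [0, 1]."""
    return lambda x: x ** n

# Numerical integrals of phi_n next to the closed form 1 / (n + 1).
integrals = {n: midpoint(phi(n), 0.0, 1.0) for n in (1, 10, 100)}
closed_form = {n: 1.0 / (n + 1) for n in integrals}

print(integrals)
print(closed_form)
```

Here the limit function vanishes almost everywhere and the integrals tend to its integral, as (11) requires; no single summable function dominates the sequence $n \sin nx$ considered earlier in this section.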


2.8. Summability of Almost-Everywhere Limits 


In some cases where the condition $\varphi_n \to \varphi$ does not imply $I\varphi_n \to I\varphi$,
we can still draw conclusions about the summability of $\varphi$ and deduce an
estimate for $I\varphi$.


2.8.1. Measurable functions. For example, if instead of the condition (10)
figuring in Lebesgue's theorem, it is assumed only that $\varphi_n(x) \in L$, $\varphi_n(x) \to \varphi(x)$
and
$$|\varphi(x)| \le \varphi_0(x) \in L, \tag{12}$$
then $\varphi(x)$ is summable, but we no longer have the limit relation (11). In fact,
$$\varphi(x) = \lim_{n \to \infty} \psi_n(x),$$
where $\psi_n(x)$ is the function $\varphi_n(x)$ truncated from above at the level $\varphi_0(x)$
and from below at the level $-\varphi_0(x)$, i.e.,
$$\psi_n(x) = \max\,\{\min\,[\varphi_n(x), \varphi_0(x)], -\varphi_0(x)\}.$$


4 Often called Lebesgue's theorem on term-by-term integration, or Lebesgue's bounded
convergence theorem.




Obviously, $\psi_n(x)$ is summable and $|\psi_n(x)| \le \varphi_0(x)$. Therefore, according to
Lebesgue's theorem, $\varphi(x)$ is summable and we have the estimate
$$|I\varphi| \le I\varphi_0.$$
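The truncation used in this argument is a pointwise clamp. A minimal sketch follows, assuming the sequence $n \sin nx$ from Sec. 2.7 as the $\varphi_n$ and the constant function $1$ as the bound $\varphi_0$ (both choices, like the names `truncate` and `phi`, are purely illustrative).

```python
import math

def truncate(phi_n, phi_0):
    """Return psi_n with psi_n(x) = max{ min[phi_n(x), phi_0(x)], -phi_0(x) }."""
    def psi_n(x):
        return max(min(phi_n(x), phi_0(x)), -phi_0(x))
    return psi_n

def phi(n):
    """phi_n(x) = n sin(nx) on [0, pi/n], and 0 elsewhere on [0, pi]."""
    return lambda x: n * math.sin(n * x) if 0.0 <= x <= math.pi / n else 0.0

def bound(x):
    """The dominating function phi_0, here the constant 1."""
    return 1.0

psi_4 = truncate(phi(4), bound)
psi_50 = truncate(phi(50), bound)

samples = [i * math.pi / 200.0 for i in range(201)]
max_abs_psi_4 = max(abs(psi_4(x)) for x in samples)   # never exceeds the bound
print(psi_4(0.3), psi_50(0.3), max_abs_psi_4)
```

Since the limit function here is $0$, which satisfies (12), truncation does not change the limit: the $\psi_n$ still converge to it, but are now dominated by $\varphi_0$, so that Lebesgue's theorem applies to them.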


In this connection, we introduce the following important concept: If a
sequence of elementary functions converges almost everywhere to a function
$\varphi(x)$, then $\varphi(x)$ is said to be measurable. For the time being, we point out
only a few simple properties of measurable functions.5 By their very construction,
all summable functions are measurable, but the class of summable
functions is in general only a proper subset of the class of measurable
functions. The inequality (12) gives a simple condition guaranteeing the
summability of a measurable function $\varphi(x)$, i.e., every measurable function
whose absolute value is bounded by a summable function is itself summable.
To see this, we need only note that
$$\varphi(x) = \lim_{n \to \infty} h_n(x),$$
where the functions $h_n(x)$ are elementary and hence summable, and then
apply the result just proved.


2.8.2. Fatou's lemma. We can replace the condition (10) figuring in
Lebesgue's theorem by the weaker condition
$$I(|\varphi_n|) \le C.$$
Then the limit function
$$\varphi = \lim \varphi_n$$
is again summable, but the limit relation (11) is replaced by the estimate
$$I(|\varphi|) \le C.$$
We begin by proving this for the case where the functions $\varphi_n$ are nonnegative:

LEMMA (Fatou's lemma). Let $\varphi_n \in L$, $\varphi_n \ge 0$ be such that $\varphi_n \to \varphi$
almost everywhere and $I\varphi_n \le C$. Then $\varphi$ is summable, and
$$0 \le I\varphi \le C.$$

Proof. If
$$\chi_n = \inf\,\{\varphi_n, \varphi_{n+1}, \ldots\} \ge 0,$$
then, as before, the functions $\chi_n$ form a nondecreasing sequence converging
almost everywhere to $\varphi$. Moreover $\chi_n \le \varphi_n$, $I\chi_n \le I\varphi_n \le C$,
and hence by Corollary 1 to Levi's theorem (Sec. 2.6), the function $\varphi$ is
summable and $I\chi_n \nearrow I\varphi$. In particular,
$$0 \le I\varphi = \lim_{n \to \infty} I\chi_n \le C,$$
as asserted.
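The estimate in Fatou's lemma can be strict. A standard example, sketched below under the assumption that $I$ is the ordinary integral on $[0, 1]$, is the sequence taking the value $n$ on the interval $(0, 1/n)$ and $0$ elsewhere: every integral equals $1$, while the limit function is identically zero. The function names are illustrative.

```python
def phi(n, x):
    """phi_n(x) = n on the open interval (0, 1/n), and 0 elsewhere in [0, 1]."""
    return float(n) if 0.0 < x < 1.0 / n else 0.0

def integral_phi(n):
    """Integral of the step function phi_n: height n times length 1/n."""
    return n * (1.0 / n)

# Every integral equals the bound C = 1 (up to rounding) ...
integrals = [integral_phi(n) for n in (1, 10, 100, 1000)]

# ... while at each fixed x the values phi_n(x) are 0 as soon as 1/n <= x.
limit_samples = [phi(10**6, x) for x in (0.0, 0.001, 0.5, 1.0)]

integral_of_limit = 0.0  # the limit function is identically zero
print(integrals, limit_samples, integral_of_limit)
```

Thus $0 = I\varphi < \lim I\varphi_n = 1$: the lemma guarantees summability of the limit and the bound $I\varphi \le C$, but not the limit relation (11).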


5 Part 3 will be largely devoted to a study of measurable functions. 




Returning to the general case $\varphi_n(x) \to \varphi(x)$, $I(|\varphi_n|) \le C$, we note that by
Fatou's lemma, $|\varphi(x)|$ is summable and $I(|\varphi|) \le C$. But then, by the result
of Sec. 2.8.1, $\varphi(x)$ itself is summable, as required.


2.9. Completeness of the Space L. The Riesz-Fischer Theorem 


We begin by recalling the definition of a normed linear space. A linear
space $R$ consisting of elements $\varphi, \psi, \ldots$ is said to be normed if with every
element $\varphi \in R$ there is associated a nonnegative number $\|\varphi\|$, called the
norm of $\varphi$, which has the following properties:

a) $\|\varphi\| > 0$ if $\varphi \ne 0$, and $\|0\| = 0$.
b) $\|\alpha\varphi\| = |\alpha|\,\|\varphi\|$ for every $\varphi \in R$ and every real number $\alpha$.
c) The triangle inequality. $\|\varphi + \psi\| \le \|\varphi\| + \|\psi\|$ for every $\varphi \in R$, $\psi \in R$.

Given a sequence of elements $\varphi_n \in R$, we say that $\varphi$ is the limit of $\varphi_n$ if
$$\lim_{n \to \infty} \|\varphi - \varphi_n\| = 0.$$
A sequence $\varphi_n$ is said to be a Cauchy (or fundamental) sequence if
$$\lim_{m, n \to \infty} \|\varphi_m - \varphi_n\| = 0,$$
i.e., if given any $\varepsilon > 0$, there exists an integer $N$ such that $\|\varphi_m - \varphi_n\| < \varepsilon$
whenever $m > N$, $n > N$. A normed linear space $R$ is said to be complete
if it satisfies the Cauchy criterion, i.e., if every Cauchy sequence $\varphi_n \in R$ has a
limit $\varphi \in R$ (relative to the norm $\|\ \|$).

Next we construct a normed linear space of summable functions $\varphi$, by
equipping $L$ (which is already a linear space) with the norm
$$\|\varphi\| = I(|\varphi|). \tag{13}$$
The fact that (13) satisfies Properties b and c follows at once from the basic
properties of the integral. Strictly speaking, this choice of norm does not
satisfy Property a, since $I(|\varphi|) = 0$ does not imply that $\varphi$ vanishes identically.
However, this discrepancy is easily removed by the simple device of identifying
all elements of $L$ which differ only on a set of measure zero,6 since,
according to Corollary 2 to Levi's theorem (Sec. 2.6), $I(|\varphi|) = 0$ implies that
$\varphi = 0$ almost everywhere.

The effort we have expended in constructing the space L is to a large 
extent justified by the following important result: 


6 More precisely, our normed space is not $L$ but another space whose elements are
classes of functions of $L$ differing only on sets of measure zero. For example, the zero
element of this new space is the class consisting of all functions which are zero almost
everywhere. We shall use the same symbol $L$ to denote both spaces.




THEOREM 4 (Riesz-Fischer theorem). The space $L$, equipped with
the norm (13), is complete, i.e., every Cauchy sequence $\varphi_n$ of summable
functions has a summable limit (in the $L$-norm).


Proof. It is enough to show that some subsequence $\varphi_{n_k}$ of the
Cauchy sequence $\varphi_n$ has a limit $\varphi \in L$, since then $\varphi$ will also be the
limit of the whole sequence $\varphi_n$. This follows from the inequality
$$\|\varphi - \varphi_n\| \le \|\varphi - \varphi_{n_k}\| + \|\varphi_{n_k} - \varphi_n\|$$
and the fact that the second term on the right goes to zero as $n \to \infty$
and $n_k \to \infty$ (since $\varphi_n$ is a Cauchy sequence). Clearly, we can always
find an increasing sequence of indices $n_k$ such that
$$\|\varphi_n - \varphi_{n_k}\| < \frac{1}{2^k} \qquad (k = 1, 2, \ldots)$$
for $n > n_k$. In particular,
$$\|\varphi_{n_{k+1}} - \varphi_{n_k}\| < \frac{1}{2^k},$$
which means that
$$I(|\varphi_{n_{k+1}} - \varphi_{n_k}|) < \frac{1}{2^k}.$$
But then, according to Levi's theorem, the series of summable functions
$$\sum_{k=1}^{\infty} |\varphi_{n_{k+1}} - \varphi_{n_k}|$$
converges almost everywhere, and hence the same is true of the series
$$\sum_{k=1}^{\infty} (\varphi_{n_{k+1}} - \varphi_{n_k}),$$
with partial sums
$$\sum_{k=1}^{N} (\varphi_{n_{k+1}} - \varphi_{n_k}) = \varphi_{n_{N+1}} - \varphi_{n_1}.$$
This means that the sequence $\varphi_{n_k}$ has a limit (almost everywhere) as
$k \to \infty$. Let $\varphi$ denote this limit. Then, for fixed $k$, the function $\varphi_{n_p} - \varphi_{n_k}$
approaches $\varphi - \varphi_{n_k}$ almost everywhere as $p \to \infty$. Since
$$I(|\varphi_{n_p} - \varphi_{n_k}|) = \|\varphi_{n_p} - \varphi_{n_k}\| < \frac{1}{2^k} \qquad (p > k),$$
it follows from the result of Sec. 2.8.2 that $\varphi - \varphi_{n_k}$ is summable, and
hence $\varphi$ itself is summable. Moreover, by the same result,
$$\|\varphi - \varphi_{n_k}\| = I(|\varphi - \varphi_{n_k}|) \le \frac{1}{2^k}.$$
Therefore $\varphi_{n_k}$ converges to $\varphi$ in the norm of the space $L$, and the proof
is complete.
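The choice of the indices $n_k$ in the proof can be made explicit in a simple case. The sketch below assumes the Cauchy sequence $x^n$ in the space of functions summable on $[0, 1]$, for which the distance between the $m$th and $n$th terms is $|1/(m+1) - 1/(n+1)|$ in closed form; the helper names are hypothetical.

```python
def l1_distance(m, n):
    """||phi_m - phi_n|| for phi_n(x) = x^n on [0, 1], in closed form."""
    return abs(1.0 / (m + 1) - 1.0 / (n + 1))

def tail_bound(n):
    """Upper bound for ||phi_m - phi_n|| over all m > n in this example."""
    return 1.0 / (n + 1)

def pick_indices(count):
    """Choose increasing n_1 < n_2 < ... with ||phi_n - phi_{n_k}|| < 2^{-k} for n > n_k."""
    indices = []
    n = 1
    for k in range(1, count + 1):
        # Advance n until the tail bound is small enough and n exceeds n_{k-1}.
        while tail_bound(n) > 0.5 ** k or (indices and n <= indices[-1]):
            n += 1
        indices.append(n)
    return indices

indices = pick_indices(5)
gaps = [l1_distance(indices[k + 1], indices[k]) for k in range(len(indices) - 1)]
print(indices, gaps)
```

The consecutive distances along the subsequence are bounded by $1/2^k$, so their sum converges, which is exactly what allows Levi's theorem to be applied in the proof.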




Finally, we prove that the set $H$ of all elementary functions is dense in
the space $L$. Since every function in $L$ is the difference of two functions
in $L^+$, we need only verify that every function $f \in L^+$ is the limit (in the norm
of $L$) of a sequence of functions $h_n \in H$. The natural choice for this sequence
is the sequence defining $f$. Then $h_n \nearrow f$, $Ih_n \nearrow If$ and
$$\|f - h_n\| = I(f - h_n) = If - Ih_n \to 0,$$
as required.
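The density argument can be followed numerically in a familiar special case. The sketch assumes that the elementary functions are step functions on $[0, 1]$ and takes $f(x) = x^2$, approximated from below by step functions constant on dyadic intervals; exact rational arithmetic is used, and the function names are illustrative.

```python
from fractions import Fraction

def lower_step_integral(n):
    """I h_n, where h_n equals (j / 2^n)^2 on [j / 2^n, (j + 1) / 2^n)."""
    cells = 2 ** n
    return sum(Fraction(j, cells) ** 2 * Fraction(1, cells) for j in range(cells))

def norm_gap(n):
    """||f - h_n|| = If - I h_n for f(x) = x^2, whose integral over [0, 1] is 1/3."""
    return Fraction(1, 3) - lower_step_integral(n)

gaps = [norm_gap(n) for n in range(1, 8)]
print(gaps)
```

Because $f$ is increasing, each $h_n$ lies below $f$ and the sequence $h_n$ is nondecreasing, so the norm of the difference is simply the difference of the integrals, which tends to zero.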


2.10. Fubini’s Theorem 


Next we consider integration over the product of two sets, deriving the
formula for reduction of a double integral to an iterated integral. It will
be recalled from calculus that the double Riemann integral of a continuous
function $f(x, y)$ can be expressed in terms of two single Riemann integrals
by using the rule
$$\iint_{\substack{a_1 \le x \le b_1 \\ a_2 \le y \le b_2}} f(x, y)\,dx\,dy = \int_{a_2}^{b_2} \left[ \int_{a_1}^{b_1} f(x, y)\,dx \right] dy.$$
As we now show, there is a similar rule in the general theory of integration:


THEOREM 5 (Fubini's theorem). Given two sets $X$ and $Y$, with Cartesian
product $W = X \times Y$,7 let $L(X)$, $L(Y)$ and $L(W)$ be corresponding
spaces of summable functions, equipped with integrals $I_X$, $I_Y$ and $I_W = I$.
Suppose the family $H(W)$ of elementary functions generating $L(W)$ has
the following properties:8

a) Every function $h(x, y) \in H(W)$ is summable in $x$ for almost all $y$;

b) The integral $I_X h(x, y)$ is summable in $y$;

c) $Ih = I_Y\{I_X h(x, y)\}$.

Then $L(W)$ has the same properties, i.e., every function $\varphi(x, y) \in L(W)$
is summable in $x$ for almost all $y$, the integral $I_X \varphi(x, y)$ is summable in $y$,
and
$$I\varphi = I_Y\{I_X \varphi(x, y)\}.$$


Proof. Let $\Phi$ denote the set of all functions $\varphi \in L(W)$ for which
the theorem holds. By hypothesis, $\Phi$ contains all the elementary functions
$h(x, y)$. The theorem will be proved once we succeed in showing that
every function in $L(W)$ belongs to $\Phi$. This will be established in five
steps.


7 By $X \times Y$ is meant the set of all ordered pairs $(x, y)$, where $x \in X$ and $y \in Y$.
8 In addition to the usual properties of a family of elementary functions.




Step 1. Obviously, $\Phi$ is closed under the formation of linear combinations,
i.e., if $\varphi_1 \in \Phi$, $\varphi_2 \in \Phi$, then $\alpha_1\varphi_1 + \alpha_2\varphi_2 \in \Phi$, where $\alpha_1$, $\alpha_2$ are
arbitrary real numbers.

Step 2. $\Phi$ is closed under monotonic passages to the limit. More exactly,
let $\varphi_1(x, y), \varphi_2(x, y), \ldots$ be a sequence of functions in $\Phi$ which is
(everywhere) monotonic, and suppose the integrals $I\varphi_n$ form a bounded
(numerical) sequence. Then
$$\varphi(x, y) = \lim_{n \to \infty} \varphi_n(x, y)$$
belongs to $\Phi$. For example, suppose the sequence $\varphi_n(x, y)$ is nondecreasing,
and let $g_n(y) = I_X \varphi_n(x, y)$. Then the sequence $g_n(y)$ is also
nondecreasing, and the integrals $I_Y g_n$ form a bounded sequence:
$$I_Y g_n = I_Y\{I_X \varphi_n(x, y)\} = I\varphi_n \nearrow I\varphi.$$
By Corollary 1 to Levi's theorem, $g_n(y)$ converges to a summable function
$g(y)$, which must be finite almost everywhere, and moreover
$$I_Y g = \lim_{n \to \infty} I_Y g_n = I\varphi.$$
Let $E \subset Y$ be the set of full measure on which the function $g(y)$ is finite,
and let $y$ be a (temporarily) fixed point of $E$. Then the sequence $\varphi_n(x, y)$
is nondecreasing and the integrals $I_X \varphi_n(x, y)$ form a bounded sequence:
$$I_X \varphi_n(x, y) = g_n(y) \nearrow g(y).$$
Therefore, applying Corollary 1 again, we find that the limit function
$\varphi(x, y)$ is summable in $x$ (for the given value of $y$), and
$$\lim_{n \to \infty} I_X \varphi_n(x, y) = g(y) = I_X \varphi(x, y).$$
But then
$$I\varphi = I_Y g(y) = I_Y\{I_X \varphi(x, y)\},$$
and hence $\varphi(x, y) \in \Phi$, as asserted.


Step 3. $\Phi$ contains every function $z(x, y)$ which is different from zero
only on a set $Z \subset W$ of measure zero. First suppose the values taken
by $z(x, y)$ on $Z$ lie between 0 and 1. Since $Z$ is a set of measure zero,
given any $m = 1, 2, \ldots$, we can construct a nondecreasing sequence of
nonnegative elementary functions $h_n^{(m)}(x, y)$ such that
$$Ih_n^{(m)}(x, y) < \frac{1}{m}, \qquad \lim_{n \to \infty} h_n^{(m)}(x, y) \ge 1 \text{ on } Z.$$
Moreover, it can always be assumed that
$$h_n^{(m+1)}(x, y) \le h_n^{(m)}(x, y),$$




since otherwise we can replace $h_n^{(m+1)}$ by $\min\,\{h_n^{(m)}(x, y), h_n^{(m+1)}(x, y)\}$.
The function
$$h^{(m)}(x, y) = \lim_{n \to \infty} h_n^{(m)}(x, y)$$
is a monotonic limit of (elementary) functions of $\Phi$, and hence belongs
to $\Phi$, by Step 2. For the same reason, the function
$$h(x, y) = \lim_{m \to \infty} h^{(m)}(x, y)$$
also belongs to $\Phi$, and clearly
$$Ih^{(m)} = \lim_{n \to \infty} Ih_n^{(m)} \le \frac{1}{m},$$
$$Ih = \lim_{m \to \infty} Ih^{(m)} = 0.$$
Moreover $h(x, y) \ge z(x, y)$ on $Z$, since $h^{(m)}(x, y) \ge z(x, y)$ on $Z$ for all
$m = 1, 2, \ldots$ If $g(y) = I_X h(x, y)$, then
$$I_Y g(y) = Ih = 0$$
by Step 2, and hence $g(y) = 0$ for almost all $y$ (by Corollary 2 to Levi's
theorem). But then, for these values of $y$, the function $h(x, y)$ vanishes
for almost all $x$, and hence the same is true of the function $z(x, y)$. It
follows that $I_X z(x, y) = 0$, and hence
$$Iz(x, y) = 0 = I_Y\{I_X z(x, y)\},$$
i.e., $z(x, y) \in \Phi$, as asserted.

If $z(x, y) \ge 0$ is an arbitrary function vanishing outside the set $Z$,
then, introducing the function $l(x, y)$ equal to 1 on $Z$ and 0 outside $Z$,
we have
$$z(x, y) = \lim_{n \to \infty} n \min\,\Bigl\{l(x, y), \frac{1}{n} z(x, y)\Bigr\}.$$
Therefore $z(x, y) \in \Phi$, by the argument just given, together with Step 2.
The general case, where $z(x, y)$ can have either sign, reduces to the case
just considered by writing $z = z^+ - z^-$.

Step 4. $\Phi$ contains every function $f(x, y) \in L^+(W)$. By the definition
of $L^+(W)$, there is a sequence of elementary functions $h_n(x, y)$ such that
$$h_n(x, y) \nearrow f(x, y), \qquad Ih_n \nearrow If,$$
where the sequence $h_n(x, y)$ need only be nondecreasing almost everywhere.
Let $\bar f(x, y)$ be the limit of the sequence
$$\bar h_1(x, y) = h_1(x, y), \quad \bar h_2(x, y) = \max\,\{h_1(x, y), h_2(x, y)\}, \quad \ldots,$$
which is nondecreasing everywhere. Note that the functions $h_n$ and $\bar h_n$
coincide almost everywhere, and hence $Ih_n = I\bar h_n$. The function $\bar f(x, y)$
coincides with $f(x, y)$ almost everywhere, and hence can be written in
the form
$$\bar f(x, y) = f(x, y) + z(x, y),$$
where $z(x, y)$ is nonzero only on a set of measure zero. But by Steps 2
and 3, both $\bar f$ and $z$ belong to $\Phi$. Therefore, by Step 1, $f$ also belongs
to $\Phi$, as asserted.


Step 5. $\Phi$ contains every function $\varphi \in L(W)$, since every such $\varphi$ is the
difference between two functions of the class $L^+(W)$. The proof of
Fubini's theorem is now complete.
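Hypothesis c) of the theorem is easy to see in the simplest setting, where the elementary functions on $W$ are step functions constant on the cells of a rectangular grid. The sketch below is an illustration under that assumption only: the cell widths `dx`, `dy` and the cell values are made-up data, and exact rational arithmetic is used so that the two sides can be compared for equality.

```python
from fractions import Fraction

def integral_w(values, dx, dy):
    """I h: sum of h_ij * dx_i * dy_j over all cells of the grid."""
    return sum(values[i][j] * dx[i] * dy[j]
               for i in range(len(dx)) for j in range(len(dy)))

def inner_x(values, dx, j):
    """I_X h(x, y) for y in the j-th horizontal strip: sum over i of h_ij * dx_i."""
    return sum(values[i][j] * dx[i] for i in range(len(dx)))

def iterated(values, dx, dy):
    """I_Y { I_X h(x, y) }: integrate the inner integral with respect to y."""
    return sum(inner_x(values, dx, j) * dy[j] for j in range(len(dy)))

# Hypothetical grid: three columns in x, two rows in y, with signed cell values.
dx = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
dy = [Fraction(1, 4), Fraction(3, 4)]
values = [[2, -1], [0, 5], [-3, 4]]

double_integral = integral_w(values, dx, dy)
iterated_integral = iterated(values, dx, dy)
print(double_integral, iterated_integral)
```

For such step functions the identity is just a rearrangement of a finite sum; the content of the theorem is that it survives the passages to the limit carried out in Steps 2 to 5.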


Remark 1. It is natural to ask whether the converse of Fubini's theorem
holds, i.e., does the existence of the iterated integral
$$I_Y\{I_X \varphi(x, y)\} \tag{14}$$
imply that $\varphi(x, y)$ is summable on the set $W$? In general, $\varphi(x, y)$ will not be
summable on $W$ (see Probs. 7 and 8, p. 57). However, if $\varphi(x, y)$ is measurable9
and nonnegative, then the existence of (14) does in fact imply the summability
of $\varphi(x, y)$ on $W$, together with the relation
$$I\varphi = I_Y\{I_X \varphi(x, y)\}. \tag{15}$$
To see this, suppose the iterated integral
$$I_Y\{I_X \varphi(x, y)\} = A$$
exists, where
$$\varphi(x, y) = \lim_{n \to \infty} h_n(x, y).$$
Then the function
$$\varphi_n(x, y) = \min\,\{\varphi(x, y), \max\,(h_1, \ldots, h_n)\}$$
is also measurable, since
$$\varphi_n(x, y) = \lim_{m \to \infty} \min\,\{h_m, \max\,(h_1, \ldots, h_n)\},$$
and moreover $\varphi_n(x, y)$ is bounded by the summable function $\max\,(h_1, \ldots, h_n)$.
Therefore $\varphi_n(x, y)$ is summable on $W$, by the result of Sec. 2.8.1, and
$$I\varphi_n = I_Y\{I_X \varphi_n(x, y)\} \le A,$$
by Fubini's theorem. Since $\varphi_n \nearrow \varphi$ and $I\varphi_n \le A$, it follows from Corollary 1
to Levi's theorem that the function $\varphi$ is summable. But then, applying
Fubini's theorem again, we obtain (15), as required.
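A well-known function of variable sign shows why nonnegativity matters in this remark. It is offered here only as a standard illustration (it is not claimed to be the example of the problems cited above): on the unit square, the function $(x^2 - y^2)/(x^2 + y^2)^2$ has both iterated integrals, but they differ, so it cannot be summable on the square. The sketch uses the closed forms of the inner integrals, which follow from the fact that $y/(x^2 + y^2)$ is an antiderivative in $y$, and a midpoint rule for the outer integrals; all names are illustrative.

```python
import math

def f(x, y):
    """The integrand (x^2 - y^2) / (x^2 + y^2)^2 on the unit square."""
    return (x * x - y * y) / (x * x + y * y) ** 2

def midpoint(func, a, b, n=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def inner_dy(x):
    """Integral of f(x, y) over 0 <= y <= 1 for x > 0, in closed form."""
    return 1.0 / (1.0 + x * x)

def inner_dx(y):
    """Integral of f(x, y) over 0 <= x <= 1 for y > 0, in closed form."""
    return -1.0 / (1.0 + y * y)

# Spot check of the closed form at x = 0.5, where the integrand is smooth.
spot_check = midpoint(lambda y: f(0.5, y), 0.0, 1.0)

dy_then_dx = midpoint(inner_dy, 0.0, 1.0)   # close to  pi / 4
dx_then_dy = midpoint(inner_dx, 0.0, 1.0)   # close to -pi / 4
print(spot_check, dy_then_dx, dx_then_dy)
```

If this function were summable on the square, Fubini's theorem applied in either order would give the same value; the two different iterated integrals rule that out.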


9 I.e., if $\varphi(x, y)$ is the almost-everywhere limit of a sequence of elementary functions, as
in Sec. 2.8.1.




Remark 2. Here we have started from a “ready-made” integral on the 
Cartesian product W of the sets X and Y, related in a certain way to the 
integrals defined on X and Y themselves. In Sec. 6.8, we shall see how to 
construct the integral on W, starting from known integrals on X and Y 
satisfying the necessary constraints. 


2.11. Integrals of Variable Sign 


So far we have required that the elementary integral be nonnegative,
and this fact has played a key role in our considerations. However, in
analysis one also encounters the case where the continuous linear functional
$Ih$,10 which we wish to call the integral, can take values of either sign (e.g.,
the Stieltjes integral, the subject of a special study in Part 2).


2.11.1. Riesz's representation theorem. The structure of continuous linear
functionals of variable sign is revealed by the following


THEOREM 6 (Riesz's representation theorem). Every continuous linear
functional $Ih$ of variable sign, defined on the space of elementary functions
$h \in H$, can be represented as the difference between two nonnegative
functionals11 on $H$.


Proof. Denoting the set of all nonnegative $h \in H$ by $H^+$ (note that
$H^+$ is not a linear space), we define the functional
$$Jh = \sup_{0 \le k(x) \le h(x)} Ik \tag{16}$$
on $H^+$. If the integral $I$ were nonnegative as before, then obviously we
would have $Jh = Ih$, but this can no longer be asserted. In any event,
it is obvious that $Jh \ge 0$ and $Jh \ge Ih$, since we can always choose
$k = 0$ or $k = h$. The possibility that $Jh = 0$ for all $h \in H^+$ is not excluded,
nor do we exclude a priori the case where $Jh$ takes the value $+\infty$ for
certain $h \in H^+$ (however, see Step 2 below). The nub of the proof is to
show that the (nonnegative) functional $J$ is linear and continuous on $H^+$,
and can be extended to the larger space $H$. This will be established in
five steps.


Step 1. $J$ is subadditive on $H^+$, i.e.,
$$J(h_1 + h_2) \le Jh_1 + Jh_2 \tag{17}$$
for every $h_1, h_2 \in H^+$. In fact, suppose $0 \le k_0 \le h_1 + h_2$. Then $k_0$ can
always be represented in the form $k_1 + k_2$, where $0 \le k_1 \le h_1$, $0 \le k_2 \le h_2$,
since we need only set $k_1 = \min\,(h_1, k_0)$, $k_2 = k_0 - k_1$. But $k_0 = k_1 + k_2$


10 Here linearity and continuity are defined as in Axioms 1 and 3, p. 24. 
11 Which are, of course, themselves linear and continuous. 




implies $Ik_0 = Ik_1 + Ik_2 \le Jh_1 + Jh_2$, and then (17) follows at once,
after taking the least upper bound of the left-hand side.
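The splitting used in Step 1 is a pointwise construction, so it can be tested on sampled values. In the sketch below, functions are represented by lists of their values at finitely many points, and the data are random nonnegative integers; this finite representation and the name `split` are assumptions made for illustration.

```python
import random

def split(k0, h1):
    """Split k0 into k1 = min(h1, k0) and k2 = k0 - k1, pointwise."""
    k1 = [min(a, b) for a, b in zip(h1, k0)]
    k2 = [a - b for a, b in zip(k0, k1)]
    return k1, k2

random.seed(0)
points = 1000
h1 = [random.randint(0, 10) for _ in range(points)]
h2 = [random.randint(0, 10) for _ in range(points)]
# Any k0 with 0 <= k0 <= h1 + h2 at every point.
k0 = [random.randint(0, a + b) for a, b in zip(h1, h2)]

k1, k2 = split(k0, h1)

bounds_hold = all(0 <= a <= b for a, b in zip(k1, h1)) and \
              all(0 <= a <= b for a, b in zip(k2, h2))
sum_is_k0 = all(a + b == c for a, b, c in zip(k1, k2, k0))
print(bounds_hold, sum_is_k0)
```

The only nontrivial point is the bound on the second summand: where $k_0 \le h_1$ it vanishes, and where $k_0 > h_1$ it equals $k_0 - h_1 \le h_2$.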


Step 2. $J$ is finite on $H^+$. Suppose, to the contrary, that $Jh_0 = +\infty$
for some $h_0 \in H^+$. Then, as we now show, there exists a sequence of
functions $h_n \in H^+$ such that
$$h_n \le \tfrac12 h_{n-1}, \qquad Jh_n = +\infty, \qquad |Ih_n| \ge n, \tag{18}$$
which contradicts the assumed continuity of the functional $I$, since
$h_n(x) \searrow 0$. To start the induction, let $h_0$ be the first function. Assuming
that functions $h_0, h_1, \ldots, h_{n-1}$ satisfying the conditions (18) have already
been constructed, we choose a function $k$ such that
$$0 \le k \le h_{n-1}, \qquad Ik > |Ih_{n-1}| + 2n$$
(here we use the fact that $Jh_{n-1} = \infty$). If $Jk = \infty$, we set $h_n = \tfrac12 k$, while
if $Jk < \infty$, then certainly
$$J(h_{n-1} - k) = \infty, \qquad |I(k - h_{n-1})| \ge Ik - |Ih_{n-1}| > 2n,$$
which allows us to set $h_n = \tfrac12(h_{n-1} - k)$. Thus a sequence satisfying (18)
actually exists, and hence the assumption that $Jh_0 = +\infty$ leads to a
contradiction, i.e., $Jh$ is finite for all $h \in H^+$, as asserted.


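Step 3. $J$ is additive and positive homogeneous on $H^+$. First we prove
that $J$ is additive on $H^+$, i.e., that
$$J(h_1 + h_2) = Jh_1 + Jh_2 \tag{19}$$
for every $h_1, h_2 \in H^+$. Given any $\varepsilon > 0$, we choose functions $0 \le k_1 \le h_1$
and $0 \le k_2 \le h_2$ such that
$$Ik_1 > Jh_1 - \varepsilon, \qquad Ik_2 > Jh_2 - \varepsilon.$$
Then
$$J(h_1 + h_2) \ge I(k_1 + k_2) = Ik_1 + Ik_2 > Jh_1 + Jh_2 - 2\varepsilon,$$
and hence
$$Jh_1 + Jh_2 \le J(h_1 + h_2), \tag{20}$$
since $\varepsilon > 0$ is arbitrary. But together (20) and (17) imply (19). Moreover,
it is obvious from the definition (16) that $J$ is positive homogeneous
on $H^+$, i.e., that
$$J(\alpha h) = \alpha Jh \qquad (h \in H^+, \; \alpha > 0).$$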


Step 4. $J$ can be extended onto $H$. If
$$\varphi = h_1 - h_2,$$
where $\varphi \in H$ and $h_1, h_2 \in H^+$, we define
$$J\varphi = Jh_1 - Jh_2.$$
This definition is unique, since if $\varphi = h_1 - h_2 = h_3 - h_4$, where $h_3$, $h_4$
also belong to $H^+$, then
$$h_1 + h_4 = h_2 + h_3,$$
$$Jh_1 + Jh_4 = J(h_1 + h_4) = J(h_2 + h_3) = Jh_2 + Jh_3,$$
and hence
$$Jh_1 - Jh_2 = Jh_3 - Jh_4.$$
The functional $J$ is still additive on $H$, since
$$\varphi_1 = h_1 - k_1, \quad \varphi_2 = h_2 - k_2 \qquad (h_1, k_1, h_2, k_2 \in H^+)$$
implies
$$\varphi_1 + \varphi_2 = (h_1 + h_2) - (k_1 + k_2),$$
$$J(\varphi_1 + \varphi_2) = J(h_1 + h_2) - J(k_1 + k_2) = Jh_1 + Jh_2 - Jk_1 - Jk_2 = J\varphi_1 + J\varphi_2.$$
Moreover, for any real $\alpha$, we have $J(\alpha\varphi) = \alpha J(\varphi)$. This is obvious for
$\alpha \ge 0$, and hence we need only consider the case $\alpha = -1$. But $\varphi = h - k$
$(h, k \in H^+)$ implies $-\varphi = k - h$, so that indeed
$$J(-\varphi) = J(k) - J(h) = -[J(h) - J(k)] = -J\varphi.$$
The fact that $J$ is linear on $H$ now follows at once.
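Step 5. $J$ is continuous on $H$, i.e.,
$$h_n \searrow 0 \quad \text{implies} \quad Jh_n \to 0.$$
Given any $\varepsilon > 0$, we choose numbers $\varepsilon_n > 0$ such that
$$\sum_{n=1}^{\infty} \varepsilon_n < \varepsilon,$$
and functions $k_n$, $0 \le k_n \le h_n$, such that
$$Ik_n > Jh_n - \varepsilon_n.$$
Moreover, let
$$\tilde k_n = \min\,(k_1, \ldots, k_n).$$
Then it is claimed that
$$Jh_n \le I\tilde k_n + \sum_{i=1}^{n} \varepsilon_i. \tag{21}$$
For $n = 1$, this follows from the definition of the function $\tilde k_1 = k_1$.
Suppose (21) holds for the values $i = 1, \ldots, n$. Obviously
$$\tilde k_{n+1} = \min\,(\tilde k_n, k_{n+1}),$$
$$\max\,(\tilde k_n, k_{n+1}) + \min\,(\tilde k_n, k_{n+1}) = \tilde k_n + k_{n+1},$$
and hence
$$I(\max\,(\tilde k_n, k_{n+1})) = I\tilde k_n + Ik_{n+1} - I\tilde k_{n+1} \ge I\tilde k_n + Jh_{n+1} - \varepsilon_{n+1} - I\tilde k_{n+1}. \tag{22}$$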




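On the other hand, observing that
$$\tilde k_n \le k_n \le h_n, \qquad k_{n+1} \le h_{n+1} \le h_n,$$
and assuming that (21) holds, we have
$$I(\max\,(\tilde k_n, k_{n+1})) \le J(h_n) \le I\tilde k_n + \sum_{i=1}^{n} \varepsilon_i. \tag{23}$$
It follows from (22) and (23) that
$$I\tilde k_n + Jh_{n+1} - \varepsilon_{n+1} - I\tilde k_{n+1} \le I\tilde k_n + \sum_{i=1}^{n} \varepsilon_i,$$
and hence
$$Jh_{n+1} \le I\tilde k_{n+1} + \sum_{i=1}^{n+1} \varepsilon_i,$$
which completes the induction. At the same time, we obtain the relation
$$\lim Jh_n \le \lim I\tilde k_n + \varepsilon = \varepsilon,$$
since $\tilde k_n \le h_n \searrow 0$ and hence $I\tilde k_n \to 0$. But then $Jh_n \to 0$, as asserted,
since $\varepsilon > 0$ is arbitrary.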


The rest of the proof is now straightforward. Consider the functional
$$N = J - I,$$
defined on $H$. If $h \ge 0$, then $Nh = Jh - Ih \ge 0$, so that $N$, like $J$, is a
nonnegative functional. Since $J$ and $I$ are linear and continuous, so is
$N$,12 and the theorem is proved.
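The decomposition can be made completely explicit in a finite toy model, which is an assumption introduced here for illustration and not the general setting of the theorem. Suppose $X$ consists of finitely many points, every function on $X$ is elementary, and $Ih$ is a weighted sum of the values of $h$ with signed weights. Then the supremum in (16) is attained by keeping $h$ where the weight is positive and taking $0$ elsewhere. The weights and the function `h` below are hypothetical data.

```python
from itertools import product

weights = [2.0, -1.5, 0.5, -3.0]  # hypothetical signed weights defining I

def I(h):
    """I h = sum of w_i * h_i, a linear functional of variable sign."""
    return sum(w * v for w, v in zip(weights, h))

def J_formula(h):
    """J h for h >= 0: only the positive weights contribute."""
    return sum(max(w, 0.0) * v for w, v in zip(weights, h))

def J_bruteforce(h):
    """sup of I k over 0 <= k <= h; a linear function on a box peaks at a vertex."""
    return max(I(k) for k in product(*[(0.0, v) for v in h]))

def N(h):
    """N = J - I, which picks up the negative weights with reversed sign."""
    return J_formula(h) - I(h)

h = [1.0, 2.0, 0.5, 1.0]  # a nonnegative function on the four points
print(I(h), J_formula(h), J_bruteforce(h), N(h))
```

In this model $J$ and $N$ correspond to the positive and negative parts of the weights, which is the finite analogue of splitting a signed functional into two nonnegative ones.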


2.11.2. Construction of a space of summable functions for the functional $I$.
Next we use the functionals $J$ and $N$ figuring in Theorem 6 to extend the
domain of definition of the functional $I$. Actually, we start from a single
nonnegative linear functional $K = J + N$, and follow the procedure described
in Sec. 2.5. First we distinguish sets of $K$-measure zero. As on p. 24, a set
$Z \subset X$ is said to be of $K$-measure zero if, given any $\varepsilon > 0$, there exists a
nondecreasing sequence of nonnegative functions $h_n(x) \in H$ such that
$Kh_n < \varepsilon$ and $\sup h_n(x) \ge 1$ on $Z$; every set of $K$-measure zero is automatically
a set of $J$-measure zero and of $N$-measure zero, since $0 \le Jh_n \le Kh_n$,
$0 \le Nh_n \le Kh_n$. Then we define the class $L_K^+$ consisting of functions $f(x)$
which are the limits of nondecreasing sequences $h_n \in H$ such that $Kh_n \le C$
$(n = 1, 2, \ldots)$. Obviously, for such functions it makes sense to talk about
$Jf$ and $Nf$, in addition to
$$Kf = \lim_{n \to \infty} Kh_n.$$


12 In particular, $h_n \searrow 0$ implies $Nh_n = Jh_n - Ih_n \to 0$ (actually $Nh_n \searrow 0$, since $N$ is
nonnegative).




Finally we form the class $L_K$ from differences
$$\varphi = f - g \qquad (f, g \in L_K^+).$$
If $\varphi$ is such a function, the expressions $K\varphi = Kf - Kg$, $J\varphi$ and $N\varphi$ are all
meaningful, and we can define the integral
$$I\varphi = J\varphi - N\varphi,$$
thereby extending the continuous linear functional $I$ to the space $L_K$, where
it satisfies the inequality
$$|I\varphi| \le |J\varphi| + |N\varphi| \le J(|\varphi|) + N(|\varphi|) = K(|\varphi|).$$


2.11.3. Other representations of $I$. The canonical representation. In
Theorem 6, we found a representation of the functional $I$ of variable sign
as a difference
$$I = J - N \tag{24}$$
between two nonnegative functionals. This representation is not unique.
In fact, let $L$ be any nonnegative continuous linear functional on $H$. Then,
besides the representation (24), we can also write
$$I = (J + L) - (N + L). \tag{25}$$
It turns out that (25) is actually the most general representation of $I$ as a
difference between two nonnegative functionals. To see this, suppose we
have any representation
$$I = J_1 - N_1, \tag{26}$$
where $J_1$ and $N_1$ are nonnegative continuous linear functionals. Then, given
any nonnegative functions $h, k \in H$ such that $k(x) \le h(x)$, we have
$$Ik = J_1 k - N_1 k \le J_1 k \le J_1 h,$$
and hence
$$Jh = \sup_{0 \le k \le h} Ik \le J_1 h.$$
Therefore $J_1 = J + L$, where $L = J_1 - J$ is a nonnegative continuous linear
functional. Moreover
$$N_1 = J_1 - I = J + L - I = N + L,$$
i.e., we have reduced (26) to the form (25), as required.

At the same time, we find that the representation (24), explicitly constructed
in Theorem 6, has a simple characterization in the class of all
possible representations of the form (26), i.e., the functionals $J$ and $N$ figuring
in (24) are the smallest possible among all that can figure in (26). For this
reason, (24) will be called the canonical representation of the functional $I$.

Finally, we show that the space $L_K$, $K = J + N$, corresponding to the
canonical representation is the largest possible, in the sense that every
function $\varphi \in L_{K_1}$, $K_1 = J_1 + N_1$, also belongs to $L_K$. Clearly, we need only
verify that every function $f \in L_{K_1}^+$ also belongs to $L_K^+$. In fact, after adding
a suitable elementary function (if necessary), $f$ is the limit, except on a set
of $K_1$-measure zero, of a sequence of nonnegative elementary functions
$h_n(x)$ with bounded integrals $I_{K_1} h_n$. But obviously $I_K h_n \le I_{K_1} h_n$ for nonnegative
elementary functions, and hence the integrals $I_K h_n$ are also bounded.
Moreover, the set of $K_1$-measure zero on which the sequence $h_n$ fails to
converge to $f$ is also a set of $K$-measure zero, since $h \ge 0$, $I_{K_1} h < \varepsilon$ implies
$I_K h < \varepsilon$. Therefore the sequence $h_n$ converges to $f$ everywhere except on a
set of $K$-measure zero, i.e., $f \in L_K^+$, as required.


3 


THE LEBESGUE INTEGRAL 
IN n-SPACE 


In this chapter we shall use the general scheme of Chap. 2 to construct 
the Lebesgue integral for a finite-dimensional space, choosing as the ele- 
mentary functions first step functions and then continuous functions. 


3.1. Relation between the Riemann Integral
and the Lebesgue Integral


It will be recalled from Sec. 1.6 that if $f(x)$ is Riemann integrable (over
the basic block $B$), then $f(x)$ is the almost-everywhere limit of a nondecreasing
sequence of (lower) step functions. Suppose we choose the family $H$ of
elementary functions to be the family of step functions $h(x)$, with the "natural"
definition of the integral, i.e.,
$$Ih = \sum_j h_j\, s(B_j), \qquad B_j = \{x : h(x) = h_j\}. \tag{1}$$
Then, as shown in Chap. 1, $I$ satisfies all the axioms for an elementary
integral given in Sec. 2.1. Therefore the entire scheme of Chap. 2 is applicable
to the present case, and implies the existence of a linear space $L(B)$ of functions
summable on the block $B$. Moreover, there is a Lebesgue integral $I\varphi$
defined on $L(B)$, and $L(B)$ is complete when equipped with the norm
$\|\varphi\| = I(|\varphi|)$.

We now try to form some idea (albeit partial) of the size of the class 
L(B). It follows from the considerations of Chap. 1 that every Riemann- 
integrable function f (in particular, every continuous function) is also
Lebesgue-integrable (in fact, $f \in L^+$), and moreover the Lebesgue integral
of f coincides with the Riemann integral of f. Thus the process of Lebesgue
integration applies to every Riemann-integrable function. But it also applies
to a much larger class of functions. For example, a function with no
continuity points at all can still be Lebesgue integrable. Thus the Dirichlet
function $\chi(x)$, defined in footnote 2, p. 8, although not Riemann integrable,
is Lebesgue integrable (being nonzero only on a set of measure zero), and in
fact $I\chi = 0$. There are more complicated examples of Lebesgue-integrable
functions which have no continuity points even after an arbitrary alteration
on a set of measure zero (see Prob. 4, p. 148).

It follows from Sec. 2.8.1 that every bounded measurable function is 
summable, since in the present case, constants are summable functions. This 
immediately raises the question of whether there are bounded nonsummable 
functions. The answer is in the affirmative (see Prob. 6, p. 57), although 
no explicit example of such a function has yet been constructed. 




3.2. Improper Riemann Integrals and the Lebesgue Integral 


Next we consider functions which have improper Riemann integrals. 
First suppose $\varphi(x)$ is nonnegative and bounded in the block
$B = \{x : |x_j| \le a_j,\ j = 1, \ldots, n\}$, everywhere except at the origin of
coordinates, where $\varphi(x)$ becomes infinite, and suppose the (ordinary)
Riemann integral

$$\int_{B - B_\varepsilon} \varphi(x)\,dx \tag{2}$$

exists for every block of the form $B_\varepsilon = \{x : |x_j| \le \varepsilon,\ j = 1, \ldots, n\}$, where (2) is
defined in the obvious way by partitioning $B - B_\varepsilon$ ($B_\varepsilon \subset B$) into subblocks
with no interior points in common. Then $\varphi(x)$ is said to be Riemann
integrable on B if the integral (2) approaches a limit as $\varepsilon \to 0$, and this limit,
denoted by

$$\int_B \varphi(x)\,dx,$$

is called the (improper) Riemann integral of $\varphi(x)$. For example, the function

$$\varphi(x) = \Bigl(\sum_{j=1}^{n} x_j^2\Bigr)^{-\alpha/2}$$

has an improper Riemann integral over any block containing the origin if
$\alpha < n$, but not if $\alpha \ge n$. Let us analyze this situation from the standpoint
of the Lebesgue integral. The integral (2) is the integral over the whole
block B of the function

$$\varphi_\varepsilon(x) = \begin{cases} \varphi(x) & \text{for } x \in B - B_\varepsilon, \\ 0 & \text{for } x \in B_\varepsilon, \end{cases}$$


1 I.e., every bounded function which is the limit almost everywhere of a sequence of
step functions (see p. 37). 




whose Riemann and Lebesgue integrals coincide. As $\varepsilon \to 0$, the functions
$\varphi_\varepsilon(x)$ form a nondecreasing sequence converging to $\varphi(x)$. Therefore, if the
integral (2) approaches a limit as $\varepsilon \to 0$, then, by Levi's theorem, the function
$\varphi$ is summable, with Lebesgue integral equal to the limit of (2) as $\varepsilon \to 0$,
i.e., to the improper Riemann integral of $\varphi$. Conversely, if (2) approaches
infinity as $\varepsilon \to 0$, i.e., if $\varphi$ has no improper Riemann integral, then $\varphi$ cannot
have a Lebesgue integral $I\varphi$, since the existence of $I\varphi$ would imply $I\varphi_\varepsilon \le I\varphi$
for all $\varepsilon$. In other words, $\varphi$ is summable on B if and only if $\varphi$ has an improper
Riemann integral on B.
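
The dichotomy can be checked numerically in the simplest case n = 1. The following is a minimal sketch, assuming the block B = [-1, 1] and the function |x|^(-alpha); the names truncated_integral and truncated_values are ours, introduced only for illustration. The integral (2) is evaluated in closed form for decreasing values of epsilon.

```python
import math

def truncated_integral(alpha: float, eps: float, a: float = 1.0) -> float:
    """Integral of |x|**(-alpha) over [-a, a] with (-eps, eps) removed (case n = 1)."""
    if alpha == 1.0:
        # Antiderivative is log(x); factor 2 accounts for both sides of the origin.
        return 2.0 * (math.log(a) - math.log(eps))
    return 2.0 * (a ** (1.0 - alpha) - eps ** (1.0 - alpha)) / (1.0 - alpha)

def truncated_values(alpha, eps_list):
    """Values of the integral (2) along a decreasing list of eps."""
    return [truncated_integral(alpha, eps) for eps in eps_list]

EPS = [10.0 ** (-k) for k in range(1, 9)]
convergent = truncated_values(0.5, EPS)  # alpha < n: values increase to a finite limit
divergent = truncated_values(1.0, EPS)   # alpha = n: values increase without bound
```

For alpha = 1/2 the values increase to 4, in agreement with Levi's theorem applied to the nondecreasing sequence of truncated functions, while for alpha = 1 they grow like 2 log(1/epsilon).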

The case of an unbounded domain of integration (rather than an 
unbounded integrand) is handled in much the same way. For example, let 
$R_n$ denote all of Euclidean n-space, and, given a nonnegative function $\varphi(x)$,
suppose the (ordinary) Riemann integral

$$\int_{B_r} \varphi(x)\,dx \tag{3}$$

exists for every block of the form $B_r = \{x : |x_j| \le r,\ j = 1, \ldots, n\}$. Then
$\varphi(x)$ is said to be Riemann integrable on $R_n$ if the integral (3) approaches
a limit as $r \to \infty$, and this limit, denoted by

$$\int_{R_n} \varphi(x)\,dx,$$

is called the (improper) Riemann integral of $\varphi(x)$. In terms of Lebesgue
integrals, (3) is the integral over $R_n$ of the function

$$\varphi_r(x) = \begin{cases} \varphi(x) & \text{for } x \in B_r, \\ 0 & \text{for } x \in R_n - B_r, \end{cases}$$

whose Riemann and Lebesgue integrals coincide. Here, to construct the
Lebesgue integral, we choose as our elementary functions all step functions
vanishing outside finite unions of (bounded) blocks. Thus $\varphi$ is summable
on $R_n$ if and only if $\varphi$ has an improper Riemann integral on $R_n$. The argument
is the same as for the case where $\varphi$ has a singular point, except that now $B_r$
and $\varphi_r$ play the roles of $B - B_\varepsilon$ and $\varphi_\varepsilon$.

To recapitulate, the class of summable functions contains all nonnegative
functions $\varphi$ with improper Riemann integrals. It is essential that $\varphi$ be
nonnegative, since otherwise this assertion breaks down (see Probs. 4 and 5,
p. 34).


3.3. Fubini’s Theorem for Functions of Several Real Variables 


We now examine the meaning of Fubini’s theorem when applied to 
functions of several real variables. In the notation of Sec. 2.10, let X be the 
block 

$$B_X = \{x : a_1 \le x_1 \le b_1, \ldots, a_m \le x_m \le b_m\}$$

in m-space, and let Y be the block

$$B_Y = \{y : c_1 \le y_1 \le d_1, \ldots, c_n \le y_n \le d_n\}$$

in n-space. Then W = X × Y is the block

$$B = \{(x, y) : a_1 \le x_1 \le b_1, \ldots, a_m \le x_m \le b_m,\ c_1 \le y_1 \le d_1, \ldots, c_n \le y_n \le d_n\}$$


in (m + n)-space. For the space H(W) we choose all step functions h(x, y)
defined on the block B, i.e., all functions of the form

$$h(x, y) = \sum_j \alpha_j\, \chi_j^X(x)\, \chi_j^Y(y),$$

where the blocks $B_X^{(j)}$ and $B_Y^{(j)}$ form partitions of $B_X$ and $B_Y$, respectively.
The function $\chi_j^X(x)$ is the characteristic function of $B_X^{(j)}$, i.e.,

$$\chi_j^X(x) = \begin{cases} 1 & \text{for } x \in B_X^{(j)}, \\ 0 & \text{for } x \in B_X - B_X^{(j)}, \end{cases}$$

and similarly for $\chi_j^Y(y)$. For the elementary integral on H(W), we make
the "natural choice"

$$Ih = \sum_j \alpha_j\, s(B_X^{(j)})\, s(B_Y^{(j)}).$$

The space H(W), equipped with this integral, clearly satisfies all the hypotheses
of Fubini's theorem, i.e., every function $h(x, y) \in H(W)$ is summable in x
for almost all y [being a step function in x except for finitely many sheets
of discontinuity of the $\chi_j^Y(y)$], the integral

$$I_X h = \sum_j \alpha_j\, s(B_X^{(j)})\, \chi_j^Y(y)$$

is summable in y (being a step function in y), and

$$Ih = \sum_j \alpha_j\, s(B_X^{(j)})\, s(B_Y^{(j)}) = I_Y\{I_X h(x, y)\}.$$

It follows that the space L(W) generated by H(W) and I has the same three
properties. In particular,

$$I\varphi = I_Y\{I_X \varphi(x, y)\}$$

for every $\varphi(x, y) \in L(W)$. Moreover, we can also write

$$I\varphi = I_X\{I_Y \varphi(x, y)\},$$

because of the symmetry between the roles of x and y in the definition of
the elementary integral.
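
The identity between the elementary integral of a step function and its iterated integral can be verified exactly on a small example. The sketch below assumes m = n = 1 and a hypothetical step function on the unit square given by four product blocks; the data TERMS and all function names are ours. Exact rational arithmetic is used so that the two sides can be compared without rounding.

```python
from fractions import Fraction

# Each term is (alpha_j, x-interval, y-interval): the value alpha_j on a product block.
# Hypothetical sample data covering the unit square.
TERMS = [
    (Fraction(2), (Fraction(0), Fraction(1, 2)), (Fraction(0), Fraction(1, 3))),
    (Fraction(-1), (Fraction(1, 2), Fraction(1)), (Fraction(0), Fraction(1, 3))),
    (Fraction(5), (Fraction(0), Fraction(1, 2)), (Fraction(1, 3), Fraction(1))),
    (Fraction(3), (Fraction(1, 2), Fraction(1)), (Fraction(1, 3), Fraction(1))),
]

def length(interval):
    lo, hi = interval
    return hi - lo

def double_integral(terms):
    """Ih: sum of alpha_j * s(B_X^(j)) * s(B_Y^(j))."""
    return sum(alpha * length(bx) * length(by) for alpha, bx, by in terms)

def inner_integral_in_x(terms, y):
    """I_X h at the point y: sum of alpha_j * s(B_X^(j)) over blocks whose y-range contains y."""
    return sum(alpha * length(bx) for alpha, bx, (lo, hi) in terms if lo <= y < hi)

def iterated_integral(terms):
    """I_Y{I_X h}: integrate the step function y -> I_X h(y) piece by piece."""
    cuts = sorted({c for _, _, by in terms for c in by})
    total = Fraction(0)
    for lo, hi in zip(cuts, cuts[1:]):
        mid = (lo + hi) / 2  # any interior point gives the constant value on this piece
        total += inner_integral_in_x(terms, mid) * (hi - lo)
    return total
```

Both double_integral(TERMS) and iterated_integral(TERMS) evaluate to 17/6 for this data.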




3.4. Continuous Functions as Elementary Functions, 
with the Riemann Integral as Elementary Integral 


We now describe another way of constructing the space L of Lebesgue-
integrable functions. Suppose we choose as our elementary functions the
set $\bar H$ of all continuous functions f(x) in the (closed bounded) block B, with
the Riemann integral as elementary integral. Then $\bar H$ satisfies Axioms a, b
and the proposed elementary (Riemann) integral, henceforth denoted by
$\bar I f$, satisfies Axioms 1-3 (see p. 24). The only nontrivial part of this assertion
is the verification of Axiom 3. But Axiom 3 is an immediate consequence
of the estimate

$$|\bar I f| = \Bigl|\int_B f(x)\,dx\Bigr| \le s(B) \max |f(x)|$$

and the following


LEMMA (Dini's lemma). A nonincreasing sequence of nonnegative
continuous functions $f_m(x)$ converging to zero at every point of a closed
bounded block B converges to zero uniformly in B.


Proof. Given any $\varepsilon > 0$ and any point $x_0 \in B$, we can find an integer
$m_0 = m(x_0)$ such that $f_{m_0}(x_0) < \varepsilon$. Then we find a neighborhood $U(x_0)$
such that $f_{m_0}(x) < \varepsilon$ for all $x \in U(x_0)$. Obviously, if $p > m_0$, then
$f_p(x) \le f_{m_0}(x) < \varepsilon$ for all $x \in U(x_0)$. Constructing such a neighborhood for every
point of B, we obtain a covering of B, from which we can select a finite
subcovering (cf. p. 13). Let q be the largest subscript of the functions
participating in this subcovering. Then $f_r(x) < \varepsilon$ for every $x \in B$ provided
that r > q, and the lemma is proved.
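
Dini's lemma can be illustrated numerically. The sketch below assumes the particular sequence f_m(x) = x^m (1 - x) on the closed interval [0, 1], which is continuous, nonnegative, nonincreasing in m and converges to zero at every point; this choice of sequence and the grid approximation of the maximum are ours, not part of the lemma.

```python
def f(m: int, x: float) -> float:
    """f_m(x) = x**m * (1 - x): continuous, nonnegative, nonincreasing in m on [0, 1]."""
    return x ** m * (1.0 - x)

# A uniform grid on [0, 1]; the maximum over the grid approximates the maximum over [0, 1].
GRID = [i / 1000 for i in range(1001)]

def sup_on_grid(m: int) -> float:
    return max(f(m, x) for x in GRID)

# Grid maxima for increasing m: they decrease toward zero, i.e., convergence is uniform.
sups = [sup_on_grid(m) for m in (1, 2, 5, 10, 50, 100)]
```

The maxima decrease from 1/4 (for m = 1) to less than 0.01 (for m = 100). Compactness matters: the sequence x^m on the half-open interval [0, 1) also decreases to zero pointwise, but its supremum stays equal to 1.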


Thus all the prerequisites for constructing a theory of the integral, based
on continuous functions as elementary functions, with the Riemann integral $\bar I$
as elementary integral, are satisfied. Let $\bar L$ denote the corresponding space
of summable functions, equipped with a Lebesgue integral $\bar I f$. Then, as we
now show, this construction of Lebesgue-integrable functions agrees with
that of Sec. 3.1, based on step functions as elementary functions (with the
obvious definition of elementary integral), leading to a space L of summable
functions equipped with a Lebesgue integral If. More precisely, we prove
the following


THEOREM. The two constructions of the Lebesgue integral in n-space
are equivalent, i.e., $\bar L = L$ and $\bar I f = If$.

Proof. The proof will be established in four steps:


Step 1. Every continuous function f(x) belongs to L. Given any $\varepsilon > 0$,
we can find a partition $\Pi = \{B_1, \ldots, B_m\}$ of the basic block B so fine
that

$$\Bigl|\int_B f(x)\,dx - \sum_{j=1}^{m} f(\xi_j)\, s(B_j)\Bigr| < \varepsilon \qquad (\xi_j \in B_j)$$

or equivalently,

$$\Bigl|\int_B f(x)\,dx - I h_\Pi\Bigr| < \varepsilon, \tag{4}$$

where $h_\Pi(x)$ is the step function equal to $f(\xi_j)$ in the block $B_j$. Since
$h_\Pi(x)$ converges uniformly to f(x) as the partition $\Pi$ is refined indefinitely,
i.e., as $d(\Pi) \to 0$, it follows from Lebesgue's theorem (see p. 36) that
$f \in L$ and

$$If = \lim_{d(\Pi) \to 0} I h_\Pi = \bar I f,$$

where the last equality is implied by (4).


Step 2. Every step function h(x) belongs to $\bar L$. Every function equal
to 1 in a block B and to 0 outside B, and hence every step function h,
can be represented (in various ways) as the limit of a bounded everywhere
convergent sequence of continuous functions $f_m(x)$,² where, as just shown,
$I f_m = \bar I f_m$. Therefore, again by Lebesgue's theorem, $h \in \bar L$ and

$$\bar I h = \lim_{m \to \infty} \bar I f_m = \lim_{m \to \infty} I f_m = Ih.$$


Step 3. Both constructions lead to the same sets of measure zero. Let Z
be a set of measure zero relative to the integral $\bar I$. Then, given any $\varepsilon > 0$,
there exists a nondecreasing sequence of nonnegative continuous
functions $f_m^{(\varepsilon)}(x)$ such that $\bar I f_m^{(\varepsilon)} < \varepsilon$ and $\sup_m f_m^{(\varepsilon)}(x) \ge 1$ on Z. By Step 1, every
$f_m^{(\varepsilon)} \in L$ and $I f_m^{(\varepsilon)} = \bar I f_m^{(\varepsilon)}$. Therefore, by Corollary 3 to Levi's theorem,
Z is a set of measure zero relative to I. Conversely, let Z be a set of
measure zero relative to the integral I. Then, given any $\varepsilon > 0$, there exists a
nondecreasing sequence of nonnegative step functions $h_m^{(\varepsilon)}(x)$ such that
$I h_m^{(\varepsilon)} < \varepsilon$ and $\sup_m h_m^{(\varepsilon)}(x) \ge 1$ on Z. By Step 2, every $h_m^{(\varepsilon)} \in \bar L$ and
$\bar I h_m^{(\varepsilon)} = I h_m^{(\varepsilon)}$. Therefore, by the same corollary, Z is a set of measure zero relative
to $\bar I$. Thus we have shown that the phrase "almost everywhere" means
the same thing in the two spaces L and $\bar L$.


Step 4. Monotonic passages to the limit and formation of differences.
Suppose $f \in L^+$, so that f is the limit (almost everywhere) of a
nondecreasing sequence of step functions $h_m$ with bounded integrals $I h_m$.
Then the integrals $\bar I h_m = I h_m$ are bounded, and hence, by Corollary 1
to Levi's theorem, $f \in \bar L$ and $\bar I f = If$. Conversely, suppose $f \in \bar L^+$, so that
f is the limit (almost everywhere) of a nondecreasing sequence of
continuous functions $f_m$ with bounded integrals $\bar I f_m$. Then the integrals
$I f_m = \bar I f_m$ are bounded, and hence, by the same corollary, $f \in L$ and
$If = \bar I f$. Finally, taking differences, we find that $\bar L$ contains every function
$f \in L$, and vice versa, with $\bar I f = If$. This completes the proof.


2 Details on the construction of such functions $f_m(x)$ will be given on p. 98.




PROBLEMS 


1. Suppose f(x) equals 1 on an open set $G \subset [a, b]$ and 0 on the complement
of G. Show that f(x) belongs to $L^+[a, b]$.


Hint. If $G = \bigcup_{j=1}^{\infty} \Delta_j$, then $f(x) = \sum_{j=1}^{\infty} \Delta_j(x)$, where $\Delta_j(x) = 1$ on $\Delta_j$ and 0
outside $\Delta_j$.


2. Construct an open set $G \subset [a, b]$ such that the function f(x) equal to 0 on
G and 1 on its complement does not belong to $L^+[a, b]$.


Hint. Choose

$$G = \bigcup_{j=1}^{\infty} \Delta_j = \bigcup_{j=1}^{\infty} (\alpha_j, \beta_j), \qquad \sum_{j=1}^{\infty} (\beta_j - \alpha_j) < b - a,$$

where G is such that every point $x \in [a, b]$ is a limit point of G.


3. Suppose a summable function f(x), defined on the closed interval [a, b],
vanishes outside $[\alpha, \beta]$, where $a < \alpha < \beta < b$, so that f(x) can be "shifted."
Prove that f(x) is "continuous in the mean," in the sense that given any $\varepsilon > 0$,
there exists a $\delta > 0$ such that

$$\|f(x + \Delta x) - f(x)\| < \varepsilon \quad \text{if } |\Delta x| < \delta, \tag{5}$$

where $\|\ \|$ denotes the L-norm.


Hint. Show that the set of all f satisfying (5) is closed in L. Then verify
(5) for step functions.


4. Consider the function $f(x) = x^\alpha \sin(x^\beta)$, defined on the half-open interval
(0, 1]. For what values of the real parameters $\alpha$ and $\beta$ is f(x)


a) Riemann-integrable (in the improper sense);
b) Lebesgue-integrable?


Ans. a) For $\alpha > -1 - |\beta|$; b) For $\alpha > -1 - \beta$ ($\beta > 0$) and $\alpha > -1$
($\beta < 0$).


Comment. For $\beta < 0$, $\beta - 1 < \alpha \le -1$, the function f(x) has an improper
Riemann integral, but no Lebesgue integral.


5. Let f(x) be the same as in the preceding problem, but this time defined on
the infinite interval $(1, \infty)$. For what values of $\alpha$ and $\beta$ is f(x)


a) Riemann integrable (in the improper sense);
b) Lebesgue integrable?


Ans. a) For $\alpha < |\beta| - 1$; b) For $\alpha < -1$ ($\beta > 0$) and $\alpha < -1 - \beta$ ($\beta < 0$).


Comment. For $\beta > 0$, $-1 \le \alpha < -1 + \beta$, the function f(x) has an improper
Riemann integral, but no Lebesgue integral.




6 (A nonmeasurable function). To simplify the construction, imagine the interval
[0, 1] wrapped around a circle $\Gamma$ of circumference 1, and measure all distances
along the circle. Two points $\xi$ and $\eta$ of the circle $\Gamma$ will be called "like" if the
distance between them is rational, and "unlike" if the distance is irrational.
The countable set of all points like a given point (i.e., at rational distances
from the point) will be called a "class." The set of all points of the circle is the
union of an uncountable collection of disjoint classes. Let f(x) be a function
defined on $\Gamma$ which for every class takes the value 1 for one member of the class
and the value 0 for all other members. Show that f(x) cannot be measurable.


Hint. If f(x) is measurable, then it is summable, and so are all its
"translates" f(x + h), with If(x + h) = If(x). Show that

$$\sum_r f(x + r) = 1,$$

where r ranges over all rational numbers. By Levi's theorem,

$$\sum_r If(x + r) = \sum_r If(x) = 1,$$

which is incompatible with either If > 0 or If = 0.


7. Consider the double integrals 
io 8) fe @] 2 =— 
a) [, ip e*”sinxsinydxdy; _b) l caueays ve dx dy. 


Show that the corresponding iterated integrals exist, for either order of integra- 
tion, and are the same in Case a but different in Case b. Show that nevertheless 
the double integrals do not exist, i.e., that the integrands are not summable. 
Why doesn’t this contradict Fubini’s theorem (more precisely, its converse, as 
stated in Remark 1, p. 43)? 


Hint. The integrands do not have constant sign. 


8.³ We start from the following two postulates of set theory:
1. The points of the continuum $C = \{\xi : 0 \le \xi \le 1\}$ can be ordered by a new
relation $\prec$ such that every subset has a least element (the well-ordering
hypothesis).
2. With this ordering, every subset $\{\xi : \xi \prec \xi_0\}$, for arbitrary $\xi_0 \in C$, has no
more than countably many elements (the continuum hypothesis).


Consider the set E of all points (x, y) of the square $0 \le x \le 1$, $0 \le y \le 1$
satisfying the "inequality" $x \prec y$. Show that every horizontal cross section of E
(i.e., every intersection of E with a horizontal straight line) contains no more
than countably many points, while the same is true of the complement of every
vertical cross section. Show that the characteristic function h(x, y) of the set E
(i.e., the function equal to 1 on E and 0 otherwise) satisfies the relations

$$I_Y\{I_X h(x, y)\} = 0, \qquad I_X\{I_Y h(x, y)\} = 1.$$

Why doesn't this contradict Remark 1, p. 43 (the converse of Fubini's theorem)?


Hint. The function h(x, y) is, of course, nonmeasurable.


3 Due to G. P. Tolstov. 


Part 2 


THE STIELTJES INTEGRAL 


4 


THE RIEMANN-STIELTJES INTEGRAL 


4.1. Blocks and Sheets 


In this chapter, we introduce the Riemann-Stieltjes integral, which 
generalizes the ordinary Riemann integral in n-space and allows us to take 
account of ‘‘spatial inhomogeneity” (in a sense that will soon be apparent). 
As a first step, we modify the concept of a block, defined in Sec. 1.1 as a 
set of points $x = (x_1, \ldots, x_n)$ satisfying the inequalities

$$a_1 \le x_1 \le b_1, \quad \ldots, \quad a_n \le x_n \le b_n. \tag{1}$$

In Part 1 only the size and volume of a block mattered, so that it made no
difference which of the symbols < or ≤ appeared in (1). However, we now
intend to replace the volume by the more general notion of a "Stieltjes
quasi-volume," which can "concentrate" on the boundary of a block or
even at individual points. (In fact, we shall even allow quasi-volumes to
take negative values.) Therefore, from now on, we must be more careful
about just what is meant by a block.

As before, the basic block $\mathbf{B}$ is a set of points of the form (1), but now
we permit $\mathbf{B}$ to be infinite (we were moving in this direction in Sec. 3.2),
i.e., one or more of the numbers $a_1, b_1, \ldots, a_n, b_n$ are allowed to be infinite
($-\infty$ for the $a_j$, $+\infty$ for the $b_j$), and correspondingly, $\mathbf{B}$ may contain points
at infinity. Thus the basic block is always compact in the "natural topology,"
and hence any sequence of points in $\mathbf{B}$ contains a subsequence converging
to a finite or infinite limit.¹


1 In this regard, we observe that an infinite block can always be transformed into a
finite block by the substitution $x_j = \tan \xi_j$ (j = 1, ..., n), where one or more end points of
the new block $\mathbf{B}^*$, consisting of points $\xi = (\xi_1, \ldots, \xi_n)$, is equal to $\pm\pi/2$ (for further
details, see Sec. 4.4.3).




By a sheet we mean the intersection of $\mathbf{B}$ with some hyperplane of
dimension n - 1 parallel to a coordinate hyperplane (i.e., with some surface
$x_j$ = const). Note that some or all of the points of a sheet may be points
at infinity. Given a basic block

$$\mathbf{B} = \{x : a_1 \le x_1 \le b_1, \ldots, a_n \le x_n \le b_n\}, \tag{2}$$

the set

$$\Gamma_k = \{x : a_1 \le x_1 \le b_1, \ldots, x_k = a_k, \ldots, a_n \le x_n \le b_n\}$$

is called the k'th sheet of the lower boundary of $\mathbf{B}$, and the union of all the
$\Gamma_k$ (k = 1, ..., n) is called the (complete) lower boundary of $\mathbf{B}$. Similarly,
the set

$$\Gamma^k = \{x : a_1 \le x_1 \le b_1, \ldots, x_k = b_k, \ldots, a_n \le x_n \le b_n\}$$

is called the k'th sheet of the upper boundary of $\mathbf{B}$, and the union of all the
$\Gamma^k$ (k = 1, ..., n) is called the (complete) upper boundary of $\mathbf{B}$. The set
of all points at infinity belonging to either the lower or the upper boundary
of $\mathbf{B}$ will be called the improper boundary of $\mathbf{B}$.

Next let $\alpha_1, \beta_1, \ldots, \alpha_n, \beta_n$ be any numbers satisfying the inequalities

$$a_j \le \alpha_j < \beta_j \le b_j \qquad (j = 1, \ldots, n),$$

where $a_j$ and $b_j$ are the same as in (2). Then by a subblock of the basic block $\mathbf{B}$
we mean a set of the form

$$B = \{x : \alpha_1 < x_1 \le \beta_1, \ldots, \alpha_n < x_n \le \beta_n\} \tag{3}$$

if $\alpha_j > a_j$ for every j, but if $\alpha_j = a_j$ for any j, we replace the inequality
$\alpha_j < x_j$ by $\alpha_j \le x_j$. In other words, B contains any point of the lower
boundary of $\mathbf{B}$ which is a limit point of B. (Note that the definition allows
the basic block to be a subblock of itself!) The word "block," without
further qualification, can mean either the basic block $\mathbf{B}$ or a subblock of $\mathbf{B}$.
The empty set ∅ will also be regarded as a block.


Remark. It should be noted that the definition of a subblock $B \subset \mathbf{B}$
depends on whether or not the intersection of the closure of B with the lower
boundary of $\mathbf{B}$ is empty (in fact, the intersection is always adjoined to B).
This could be avoided by defining all blocks, including the basic block 
itself, as “half-open”’ sets of the form (3). Admittedly, this choice achieves 
a certain notational simplicity at this point, but it would become unmanage- 
able later, primarily because the use of Dini’s lemma (p. 54) depends on B 
being compact (see Prob. 6, p. 108). 


The following two properties of blocks are easily verified: 


1) If B and B' are blocks, then so is their intersection BB';
2) If $B_1$ and B are blocks, $B_1 \subset B$, then there exist blocks $B_2, \ldots, B_m$
such that $B = B_1 \cup B_2 \cup \cdots \cup B_m$,


where the blocks $B_1, B_2, \ldots, B_m$ are (pairwise) disjoint.




Remark. In the language of Sec. 7.1, the set of all blocks (contained in 
a basic block $\mathbf{B}$) forms a semiring. Note that Property 2 fails if blocks are
defined as in Part 1, with < replaced by ≤ in (3).


Finally, we introduce the notion of a dense set of blocks. Given a basic
block (2), in each closed interval $a_k \le x_k \le b_k$, we choose a dense subset
$E_k$ (k = 1, ..., n), which in particular contains the end points $a_k$ and $b_k$.
Then the set Q of all blocks of the form (3) with end points $\alpha_1, \beta_1, \ldots, \alpha_n, \beta_n$
belonging to the sets $E_1, \ldots, E_n$, respectively, is said to be dense in $\mathbf{B}$.
In this way, every collection of dense sets $E_1, \ldots, E_n$ generates a dense set
of blocks $B \subset \mathbf{B}$.


4.2. Quasi-Volumes 


Given a dense set of blocks Q in the basic block $\mathbf{B}$ (which may be infinite),
suppose a real number $\sigma(B)$ is associated with every block $B \in Q$, where
$\sigma(B)$ is additive in the sense that

$$\sigma(B) = \sigma(B_1) + \cdots + \sigma(B_m)$$

if B is a union of disjoint blocks $B_1, \ldots, B_m$ [in particular, $\sigma(\emptyset) = 0$].
Then the function $\sigma(B)$ is called a (Stieltjes) quasi-volume. It is important to
note that in general $\sigma(B)$ is signed, i.e., can take values of either sign.

If $\sigma(B) \ge 0$ for every block $B \in Q$, the quasi-volume is said to be
nonnegative. A quasi-volume $\sigma$ is said to be of bounded variation if, given any
partition $\Pi$ of the block $\mathbf{B}$ into a set of disjoint blocks $B_1, \ldots, B_m \in Q$,
the inequality

$$\sum_{j=1}^{m} |\sigma(B_j)| \le C \tag{4}$$

holds, where the constant C does not depend on the choice of the blocks
$B_1, \ldots, B_m$. The smallest value of the constant C figuring in (4), i.e., the
quantity

$$V_{\mathbf{B}}(\sigma) = \sup \sum_{j=1}^{m} |\sigma(B_j)|,$$

where the least upper bound is taken with respect to all partitions of the
block $\mathbf{B}$, is called the total variation of the quasi-volume $\sigma$ in the block $\mathbf{B}$.
If the quasi-volume $\sigma$ is nonnegative, i.e., if $\sigma(B) \ge 0$ for every $B \in Q$, the
condition (4) reduces to finiteness of $\sigma(\mathbf{B})$.

We now give some examples of quasi-volumes, defined for all subblocks
$B \subset \mathbf{B}$:


Example 1. Let

$$\sigma(B) = s(B),$$

where s(B) is the volume of the block B in the ordinary sense (and $\mathbf{B}$ is finite).


Example 2. Let

$$\sigma(B) = \int_B g(x)\,dx,$$

where g(x) is a function summable over $\mathbf{B}$. If $g(x) \ge 0$, then $\sigma(B)$ is a
nonnegative quasi-volume. In the general case, where g(x) has variable sign,
$\sigma(B)$ is of bounded variation, since

$$\sum_{j=1}^{m} |\sigma(B_j)| = \sum_{j=1}^{m} \Bigl|\int_{B_j} g(x)\,dx\Bigr| \le \int_{\mathbf{B}} |g(x)|\,dx$$

if $\mathbf{B} = B_1 \cup \cdots \cup B_m$. (Supply some missing details.)


Example 3. Given a sequence of points $c_1, \ldots, c_m, \ldots$ in the basic
block $\mathbf{B}$ and a sequence of real numbers $g_1, \ldots, g_m, \ldots$ such that

$$\sum_{m=1}^{\infty} |g_m| = g < \infty,$$

let $\sigma(B)$ equal the sum of all the $g_m$ such that the corresponding points $c_m$
belong to the block B. If all the $g_m \ge 0$, then $\sigma(B)$ is nonnegative. In the
general case, where the numbers $g_m$ take either sign, $\sigma(B)$ has total variation
no greater than g.
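
Example 3 is easy to experiment with in one dimension. The sketch below assumes the basic block [0, 1] and a finite list of hypothetical point masses (POINTS and WEIGHTS are sample data of our own choosing); blocks are half-open intervals (alpha, beta], closed on the left when alpha coincides with the left end point of the basic block, following the convention of Sec. 4.1.

```python
A = 0.0                                   # left end point of the basic block [0, 1]
POINTS = [0.1, 0.25, 0.5, 0.5, 0.9]       # c_m (hypothetical sample data)
WEIGHTS = [1.0, -2.0, 0.5, 0.25, -0.75]   # g_m, of either sign

def sigma(alpha: float, beta: float) -> float:
    """Quasi-length of (alpha, beta], read as [alpha, beta] when alpha == A."""
    def inside(c: float) -> bool:
        return (alpha < c or (alpha == A and c == A)) and c <= beta
    return sum(g for c, g in zip(POINTS, WEIGHTS) if inside(c))

def variation_sum(cuts) -> float:
    """Sum of |sigma(B_j)| over the partition with the given cut points."""
    return sum(abs(sigma(lo, hi)) for lo, hi in zip(cuts, cuts[1:]))

TOTAL = sum(abs(g) for g in WEIGHTS)      # the bound g of Example 3
```

Here sigma is additive, for instance sigma(0, 0.5) + sigma(0.5, 1) equals sigma(0, 1), and every variation sum is bounded by TOTAL = 4.5; the partition with cut points 0, 0.2, 0.4, 0.6, 1 attains this bound.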


4.3. Quasi-Length and the Generating Function 


In the case n = 1, the block $\mathbf{B}$ reduces to a closed interval [a, b], where
one or both end points can be infinite. A subblock $B \subset \mathbf{B}$ is then a half-open
interval $(\alpha, \beta]$ if $\alpha > a$ or a closed interval if $\alpha = a$. Thus, in writing $(\alpha, \beta]$,
we do so with the understanding that ( is to be replaced by [ if $\alpha = a$. To
specify a dense set of blocks Q, we choose a set of points E which is dense
in [a, b] and contains the end points a and b. Then the block $B = (\alpha, \beta]$
belongs to Q if and only if both points $\alpha$ and $\beta$ belong to E. In the case of
intervals, it is more natural to refer to quasi-volume as quasi-length. With
every quasi-length $\sigma(\alpha, \beta]$, $\alpha, \beta \in E$, we can associate a function of the real
variable x, defined by

$$F(x) = \begin{cases} \sigma[a, x] & \text{for } x \in E,\ x \ne a, \\ 0 & \text{for } x = a. \end{cases}$$

Then, from a knowledge of the function F(x), we can find the quasi-length
of every $(\alpha, \beta] \in Q$, i.e.,

$$\sigma(\alpha, \beta] = \sigma[a, \beta] - \sigma[a, \alpha] = F(\beta) - F(\alpha). \tag{5}$$

The function F(x) will be called the generating function of the quasi-length
$\sigma$ (synonymously, the distribution function of $\sigma$).




Conversely, let F(x) be any function which is finite on E and vanishes
for x = a. Then F(x) can serve as a generating function, since the interval
function $\sigma(\alpha, \beta]$ defined in terms of F(x) by using formula (5) is automatically
additive.

Obviously, the quasi-length $\sigma(\alpha, \beta]$ is nonnegative if and only if the
corresponding generating function F(x) is nondecreasing on E. We now
interpret the definition of a quasi-length of bounded variation in terms of
the generating function. In the present case, a partition of the block $\mathbf{B} = [a, b]$
into disjoint subblocks $B_1, \ldots, B_m$ corresponds to a partition of the closed
interval [a, b] into half-open subintervals (except for the first), i.e.,

$$[a, b] = [x_0, x_1] \cup (x_1, x_2] \cup \cdots \cup (x_{m-1}, x_m],$$

where $a = x_0 < x_1 < \cdots < x_{m-1} < x_m = b$ and the points $x_0, x_1, \ldots,
x_{m-1}, x_m$ all belong to E. Then the inequality (4) on p. 63 takes the form

$$\sum_{j=1}^{m} |\sigma(B_j)| = \sum_{j=1}^{m} |F(x_j) - F(x_{j-1})| \le C, \tag{6}$$

where $F(x_0) = 0$. A function F(x) defined on the set E satisfying the
inequality (6) for any choice of the integer m and the points $x_0, \ldots, x_m$
in E will be called a function of bounded variation. Thus a quasi-length $\sigma$
is of bounded variation if and only if its generating function F(x) is of
bounded variation. The smallest value of the constant C figuring in (6), i.e.,
the quantity

$$V_a^b(F) = \sup \sum_{j=1}^{m} |F(x_j) - F(x_{j-1})|,$$

where the least upper bound is taken with respect to all partitions of the
interval [a, b], is called the total variation of the generating function F in
the interval [a, b].
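
The sums in (6) can be computed directly. The sketch below assumes the generating function F(x) = sin x on [0, 2π] (which vanishes at the left end point) and uniform partitions; both choices, and the function names, are ours. Every such sum is a lower bound for the total variation, which for this F equals 4.

```python
import math

def variation_sum(F, points) -> float:
    """Sum of |F(x_j) - F(x_{j-1})| over consecutive partition points, as in (6)."""
    return sum(abs(F(x1) - F(x0)) for x0, x1 in zip(points, points[1:]))

def uniform_partition(a: float, b: float, m: int):
    """The m + 1 points a = x_0 < x_1 < ... < x_m = b with equal spacing."""
    return [a + (b - a) * j / m for j in range(m + 1)]

crude = variation_sum(math.sin, uniform_partition(0.0, 2 * math.pi, 3))
fine = variation_sum(math.sin, uniform_partition(0.0, 2 * math.pi, 4000))
```

With three subintervals the sum is about 3.46, while the fine partition gives a value within rounding error of 4, the sum of the rises and falls of sin x over one period.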


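Remark. The generating function can also be defined for an n-dimensional
quasi-volume, although in this case it is not a particularly useful concept.
For simplicity, we consider the case n = 2. Then the basic block $\mathbf{B}$ is a set
defined by inequalities of the form

$$a_1 \le x_1 \le b_1, \qquad a_2 \le x_2 \le b_2.$$

Suppose $E_1$ is dense in $[a_1, b_1]$ and contains the points $a_1$, $b_1$, and similarly
for $E_2$. Let $B_{\alpha_1 \alpha_2}^{\beta_1 \beta_2}$ be the block defined by

$$\alpha_1 < x_1 \le \beta_1, \quad \alpha_2 < x_2 \le \beta_2, \qquad \alpha_1, \beta_1 \in E_1, \quad \alpha_2, \beta_2 \in E_2,$$

where $\alpha_j < x_j$ is replaced by $\alpha_j \le x_j$ if $\alpha_j = a_j$ (j = 1, 2). If we define the
function

$$F(x_1, x_2) = \sigma\bigl(B_{a_1 a_2}^{x_1 x_2}\bigr), \tag{7}$$

then the formula

$$\sigma\bigl(B_{\alpha_1 \alpha_2}^{\beta_1 \beta_2}\bigr) = \sigma\bigl(B_{a_1 a_2}^{\beta_1 \beta_2}\bigr) - \sigma\bigl(B_{a_1 a_2}^{\alpha_1 \beta_2}\bigr) - \sigma\bigl(B_{a_1 a_2}^{\beta_1 \alpha_2}\bigr) + \sigma\bigl(B_{a_1 a_2}^{\alpha_1 \alpha_2}\bigr)$$
$$= F(\beta_1, \beta_2) - F(\alpha_1, \beta_2) - F(\beta_1, \alpha_2) + F(\alpha_1, \alpha_2)$$

allows us to reconstruct the quasi-volume $\sigma(B)$ of any block $B \in Q$. For this
reason, the function (7) is called the generating function of the quasi-volume
$\sigma(B)$.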
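
The reconstruction formula can be checked on a quasi-volume with a known density. The sketch below assumes the basic block [0, 1] × [0, 1] and the quasi-volume of Example 2 with density g(x_1, x_2) = x_1 x_2, for which the generating function (7) is available in closed form; the density and the function names are our own illustrative choices.

```python
def F(x1: float, x2: float) -> float:
    """Generating function (7) for density g = x1 * x2 with a1 = a2 = 0."""
    return (x1 ** 2) * (x2 ** 2) / 4.0

def sigma_from_F(alpha1: float, beta1: float, alpha2: float, beta2: float) -> float:
    """Quasi-volume of (alpha1, beta1] x (alpha2, beta2] recovered from F."""
    return F(beta1, beta2) - F(alpha1, beta2) - F(beta1, alpha2) + F(alpha1, alpha2)

def sigma_direct(alpha1: float, beta1: float, alpha2: float, beta2: float) -> float:
    """The same quasi-volume obtained by integrating the density directly."""
    return (beta1 ** 2 - alpha1 ** 2) * (beta2 ** 2 - alpha2 ** 2) / 4.0
```

For the block with alpha_1 = 0.5, beta_1 = 1, alpha_2 = 0.25, beta_2 = 0.75, both functions return 0.09375.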


4.4. The Riemann-Stieltjes Integral and Its Properties 


We now introduce a far-reaching generalization of the concept of the 
Riemann integral, studied in Chap. 1. 


4.4.1. Construction of the Riemann-Stieltjes integral. First we assume that
the basic block $\mathbf{B}$ is finite and that a quasi-volume $\sigma(B)$ of bounded variation
is defined on some dense set Q of blocks $B \subset \mathbf{B}$. Let f(x) be a function defined
on the block $\mathbf{B}$. Consider an arbitrary partition $\Pi$ of the block $\mathbf{B}$ into disjoint
subblocks belonging to Q, i.e., $\mathbf{B} = B_1 \cup \cdots \cup B_m$, and as in Sec. 1.1,
let $d(\Pi)$ denote the largest size of the blocks $B_1, \ldots, B_m$. Choosing a point
$\xi_j$ in each block $B_j$, we form the Riemann-Stieltjes sum

$$S_\Pi(f) = \sum_{j=1}^{m} f(\xi_j)\, \sigma(B_j). \tag{8}$$

Let $\Pi_1, \ldots, \Pi_p, \ldots$ be a sequence of partitions such that $d(\Pi_p) \to 0$, and
suppose the sequence $S_{\Pi_p}(f)$ has a limit as $p \to \infty$, which is independent of
the choice of the sequence $\Pi_p$ [provided only that $d(\Pi_p) \to 0$] or of the
points $\xi_j \in B_j$. Then the limit is called the Riemann-Stieltjes integral of the
function f(x) [over the block $\mathbf{B}$] with respect to the quasi-volume $\sigma$, and we
write

$$I_\sigma f = \int_{\mathbf{B}} f(x)\, \sigma(dx) = \lim_{p \to \infty} S_{\Pi_p}(f).$$

Correspondingly, the function f(x) is said to be Riemann-Stieltjes integrable
(over the block $\mathbf{B}$) with respect to the quasi-volume $\sigma$.


THEOREM 1. If f(x) is continuous in the block $\mathbf{B}$, then f(x) is Riemann-
Stieltjes integrable over $\mathbf{B}$ with respect to any quasi-volume $\sigma$ of bounded
variation.


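Proof. By hypothesis, given any $\varepsilon > 0$, we can find a partition
$\Pi = \{B_j\}$ of the basic block so fine that $|f(x') - f(x'')| < \varepsilon$ for all x',
$x'' \in B_j$. Such a partition will be said to belong to $\varepsilon$. Given a partition $\Pi$
belonging to $\varepsilon$ and a finer partition $\Pi' = \{B_{jk}\}$, where $B_{jk} \subset B_j$ for every
k, let $\xi_{jk}$ be an arbitrary point of the block $B_{jk}$. Then clearly

$$f(\xi_{jk}) - f(\xi_j) = \delta_{jk}$$

has absolute value less than $\varepsilon$. By definition,

$$S_\Pi(f) = \sum_j f(\xi_j)\, \sigma(B_j), \qquad S_{\Pi'}(f) = \sum_j \sum_k f(\xi_{jk})\, \sigma(B_{jk}),$$

which implies

$$|S_{\Pi'}(f) - S_\Pi(f)| = \Bigl|\sum_j \Bigl[\sum_k f(\xi_{jk})\, \sigma(B_{jk}) - f(\xi_j)\, \sigma(B_j)\Bigr]\Bigr|
= \Bigl|\sum_j \sum_k \delta_{jk}\, \sigma(B_{jk})\Bigr| \le \varepsilon \sum_j \sum_k |\sigma(B_{jk})| \le \varepsilon V_{\mathbf{B}}(\sigma).$$

It follows that if $\Pi$ and $\Pi'$ are any two partitions belonging to $\varepsilon$, and
if $\Pi^*$ is the new partition consisting of all intersections of blocks of $\Pi$
with blocks of $\Pi'$, then

$$|S_\Pi(f) - S_{\Pi^*}(f)| \le \varepsilon V_{\mathbf{B}}(\sigma), \qquad |S_{\Pi'}(f) - S_{\Pi^*}(f)| \le \varepsilon V_{\mathbf{B}}(\sigma),$$

and hence

$$|S_\Pi(f) - S_{\Pi'}(f)| \le 2\varepsilon V_{\mathbf{B}}(\sigma). \tag{9}$$

Now let $\Pi_p$ be a sequence of partitions belonging to a sequence of
numbers $\varepsilon_p \to 0$. Then, applying the above argument to the partitions $\Pi_p$
and $\Pi_{p+q}$, we see that the numbers $S_{\Pi_p}(f)$ form a Cauchy sequence, and
hence tend to some limit $I_\sigma f$. If $\Pi'_p$ is another sequence of partitions
belonging to the numbers $\varepsilon_p$, then, applying the same argument, this time
to the partitions $\Pi_p$ and $\Pi'_p$, we find that

$$S_{\Pi_p}(f) - S_{\Pi'_p}(f) \to 0.$$

Thus $I_\sigma f$ is independent of the choice of the sequence $\Pi_p$, and the theorem
is proved. We note, in passing, that the inequality

$$\Bigl|\int_{\mathbf{B}} f(x)\, \sigma(dx) - \sum_j f(\xi_j)\, \sigma(B_j)\Bigr| \le \varepsilon V_{\mathbf{B}}(\sigma) \tag{10}$$

holds for any partition belonging to the number $\varepsilon$.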


Remark. Let $\sigma(\alpha, \beta]$ be a quasi-length of bounded variation. Then
Theorem 1 implies the existence of the Riemann-Stieltjes integral

$$I_\sigma f = \int_{[a, b]} f(x)\, \sigma(dx) = \lim \sum_{j=1}^{m} f(\xi_j)\, \sigma(x_{j-1}, x_j] \tag{11}$$

for every function f(x) continuous in the interval [a, b]. Here, of course,
$\xi_j$ is an arbitrary point in $(x_{j-1}, x_j]$, and in the limit on the right, the maximum
length of the subintervals $(x_{j-1}, x_j]$ is made to approach zero. Let F(x) be
the generating function of the quasi-length $\sigma(\alpha, \beta]$. Then another way of
writing the integral (11) is

$$\int_{[a, b]} f(x)\, dF(x),$$

suggested by the relation

$$\lim \sum_{j=1}^{m} f(\xi_j)\, \sigma(x_{j-1}, x_j] = \lim \sum_{j=1}^{m} f(\xi_j)\,[F(x_j) - F(x_{j-1})]$$

[again with $F(x_0) = F(a) = 0$]. In particular, if $f(x) \equiv 1$, we have

$$\int_{[a, b]} 1\, dF(x) = \sigma[a, b] = F(b) = F(b) - F(a),$$

in keeping with elementary calculus.
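
The sums appearing in (11) are straightforward to evaluate. The sketch below assumes the interval [0, 1], uniform partitions, and tag points chosen at the right end of each subinterval; the function names and the three sample integrands are ours. It treats a smooth generating function, a generating function with a single jump, and the case f(x) ≡ 1.

```python
def stieltjes_sum(f, F, points, tag=lambda lo, hi: hi) -> float:
    """Sum of f(xi_j) * [F(x_j) - F(x_{j-1})], with xi_j = tag(x_{j-1}, x_j)."""
    return sum(f(tag(lo, hi)) * (F(hi) - F(lo)) for lo, hi in zip(points, points[1:]))

def uniform_partition(a: float, b: float, m: int):
    return [a + (b - a) * j / m for j in range(m + 1)]

PTS = uniform_partition(0.0, 1.0, 2000)

# F(x) = x**2: the integral of x dF(x) equals the integral of 2 * x**2 dx = 2/3.
smooth = stieltjes_sum(lambda x: x, lambda x: x * x, PTS)

# F jumps by 1 at x = 0.5: the whole quasi-length sits at that point, giving f(0.5).
jump = stieltjes_sum(lambda x: x * x, lambda x: 1.0 if x >= 0.5 else 0.0, PTS)

# f = 1: the sum telescopes to F(b) - F(a).
unit = stieltjes_sum(lambda x: 1.0, lambda x: x * x, PTS)
```

The first sum is within 0.001 of 2/3, the second equals f(0.5) = 0.25, and the third equals F(1) - F(0) = 1 up to rounding.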


4.4.2. Further properties. The following two properties of the Riemann-
Stieltjes integral obviously hold in the general case (i.e., without any special
assumptions about the continuity of the integrand):


1) If $f_1(x)$ and $f_2(x)$ are Riemann-Stieltjes integrable, then so is $\alpha_1 f_1(x) +
\alpha_2 f_2(x)$, where $\alpha_1$ and $\alpha_2$ are arbitrary real numbers, and moreover

$$\int_{\mathbf{B}} [\alpha_1 f_1(x) + \alpha_2 f_2(x)]\, \sigma(dx) = \alpha_1 \int_{\mathbf{B}} f_1(x)\, \sigma(dx) + \alpha_2 \int_{\mathbf{B}} f_2(x)\, \sigma(dx).$$

2) If f(x) is Riemann-Stieltjes integrable and if $|f(x)| \le M$ for all $x \in \mathbf{B}$,
then

$$\Bigl|\int_{\mathbf{B}} f(x)\, \sigma(dx)\Bigr| \le M V_{\mathbf{B}}(\sigma). \tag{12}$$


In some cases, the Riemann-Stieltjes integral can be expressed in terms 
of the ordinary Riemann integral. Thus consider Example 2, p. 64, where 


$$\sigma(B) = \int_B g(x)\,dx,$$

and suppose the summable function g(x) is itself continuous in $\mathbf{B}$. Then,
given any continuous function f(x), we have

$$\int_{\mathbf{B}} f(x)\, \sigma(dx) = \int_{\mathbf{B}} f(x) g(x)\,dx. \tag{13}$$


In fact, if I] = {B;} is a partition of B belonging to < for both f(x) and 
f(x)g(x), then, according to (10), 


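$$\Bigl|\int_{\mathbf{B}} f(x)\, \sigma(dx) - \sum_j f(\xi_j)\, \sigma(B_j)\Bigr| \le \varepsilon V_{\mathbf{B}}(\sigma),$$

$$\Bigl|\int_{\mathbf{B}} f(x) g(x)\,dx - \sum_j f(\xi_j) g(\xi_j)\, s(B_j)\Bigr| \le \varepsilon s(\mathbf{B}),$$

where $s(\mathbf{B})$ is ordinary volume. If $\Pi$ is chosen to belong to $\varepsilon$ for g(x) as well,
then

$$\Bigl|\sum_j f(\xi_j)\, \sigma(B_j) - \sum_j f(\xi_j) g(\xi_j)\, s(B_j)\Bigr|
= \Bigl|\sum_j f(\xi_j) \int_{B_j} [g(x) - g(\xi_j)]\,dx\Bigr| \le \varepsilon s(\mathbf{B}) \max |f(x)|.$$

But then

$$\Bigl|\int_{\mathbf{B}} f(x)\, \sigma(dx) - \int_{\mathbf{B}} f(x) g(x)\,dx\Bigr| \le \varepsilon\bigl[V_{\mathbf{B}}(\sigma) + s(\mathbf{B}) + s(\mathbf{B}) \max |f(x)|\bigr],$$

which implies (13), since $\varepsilon$ is arbitrary.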


4.4.3. The case of infinite B. The above construction can easily be carried
over to the case where the block $\mathbf{B}$ is infinite. It is only necessary to be careful
about the meaning of an "arbitrarily fine partition." Suppose we make the
transformation $x_j = \tan \xi_j$ already mentioned in footnote 1, p. 61. Then $\mathbf{B}$
goes into a finite block $\mathbf{B}^*$ in the space of points $\xi = (\xi_1, \ldots, \xi_n)$, and
we say that the partition $\Pi$ of the block $\mathbf{B}$ is "arbitrarily fine" if the
corresponding partition $\Pi^*$ of the block $\mathbf{B}^*$ is arbitrarily fine in the usual sense,
i.e., if $d(\Pi^*)$ can be made less than any preassigned $\varepsilon > 0$. Moreover, a
function $f(x) = f(x_1, \ldots, x_n)$ defined in $\mathbf{B}$ is said to be continuous in $\mathbf{B}$ if
the function $f(\tan \xi_1, \ldots, \tan \xi_n)$ is continuous in $\mathbf{B}^*$. In other words, if $\Gamma$ is
the improper boundary of $\mathbf{B}$, then f(x) is continuous in $\mathbf{B} - \Gamma$ and can be
continuously extended onto $\Gamma$. Once these conventions have been established,
we see at once that Theorem 1 remains valid for an infinite block $\mathbf{B}$.

As we now Show, Theorem | still holds even if f(x) cannot be continuously 
extended onto the improper boundary of B, provided the conditions on o 
are strengthened somewhat. First we need the following 


DEFINITION. Let $\sigma(B)$ be a quasi-volume of bounded variation defined 
on a dense set of blocks Q in an infinite block B. Then $\sigma(B)$ is said to be 
continuous at infinity if, given any $\varepsilon > 0$, there exists a finite block 
$B_\varepsilon \subset B$, $B_\varepsilon \in Q$ such that 

$$\sum_{j=1}^{m} |\sigma(B_j)| < \varepsilon \tag{14}$$

for arbitrary disjoint blocks $B_j$ contained in $B - B_\varepsilon$ ($B_j \in Q$).² 


Example. Let $B = [0, \infty]$ and consider the quasi-volume $\sigma(\alpha, \beta]$ with 
generating function 

$$F(x) = \begin{cases} e^{-x} & \text{for } 0 \le x < \infty,\ x \text{ rational}, \\ 1 & \text{for } x = \infty. \end{cases}$$

Then $\sigma$ is of bounded variation, but not continuous at infinity, since, for 
example, $\sigma(x, \infty] > \tfrac{1}{2}$ for arbitrarily large rational x. However, to make 
$\sigma$ continuous at infinity, we need only "correct" F(x) by setting $F(\infty) = 0$. 

We are now in a position to prove the promised refinement of Theorem 1: 


THEOREM 2. Let B be an infinite block with improper boundary $\Gamma$, 
and suppose f(x) is continuous and bounded in $B - \Gamma$. Then f(x) is Riemann-
Stieltjes integrable over B with respect to any quasi-volume $\sigma$ of bounded 
variation which is continuous at infinity. 


² It is important to note that the blocks $B_j$ can be infinite. 




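Proof. The theorem has just been proved (without the extra condition 
on $\sigma$) for the case where f(x) can be continuously extended onto $\Gamma$. 
Therefore the reader should now have in mind an example like $f(x) = \cos x$ 
$(-\infty < x < \infty)$, $B = [-\infty, \infty]$, where f(x) cannot be continuously 
extended onto $\Gamma$. Given any $\varepsilon > 0$, let $B_\varepsilon \subset B$, $B_\varepsilon \in Q$ be such that 
the inequality (14) holds. Observing that any partition of the basic block 
B generates a partition of $B_\varepsilon$ (consisting of the intersections of $B_\varepsilon$ with the 
blocks of the partition), let $\Pi = \{B_j\}$ and $\Pi' = \{B'_k\}$ be any two partitions 
of B such that both partitions $\{B_j B_\varepsilon\}$ and $\{B'_k B_\varepsilon\}$ belong to $\varepsilon$ (in the sense 
defined on p. 66). Such partitions exist, since f(x) is uniformly continuous 
in $B_\varepsilon$. Let $\xi_j$ be any (finite) point of $B_j$ and $\xi'_k$ any (finite) point of $B'_k$. 
Then, according to formula (9), p. 67, 

$$\left| \sum_j f(\xi_j)\,\sigma(B_j B_\varepsilon) - \sum_k f(\xi'_k)\,\sigma(B'_k B_\varepsilon) \right| \le 2\varepsilon V_B(\sigma), \tag{15}$$

where $V_B(\sigma)$ is the total variation of $\sigma$. Moreover, 

$$\sum_j f(\xi_j)\,\sigma(B_j) - \sum_j f(\xi_j)\,\sigma(B_j B_\varepsilon) = \sum_j f(\xi_j)\,[\sigma(B_j) - \sigma(B_j B_\varepsilon)] = \sum_j f(\xi_j)\,\sigma\Bigl(\bigcup_k B_{jk}\Bigr),$$

where the $B_{jk}$ are suitable disjoint blocks contained in $B - B_\varepsilon$ ($B_{jk} \in Q$) 
whose union equals $B_j - B_j B_\varepsilon$ (if $B_j \subset B_\varepsilon$, all the $B_{jk}$ are empty). It 
follows that 

$$\left| \sum_j f(\xi_j)\,\sigma(B_j) - \sum_j f(\xi_j)\,\sigma(B_j B_\varepsilon) \right| \le M \sum_{j,k} |\sigma(B_{jk})| < M\varepsilon, \tag{16}$$

where 

$$M = \sup_{x \in B - \Gamma} |f(x)|,$$

and similarly 

$$\left| \sum_k f(\xi'_k)\,\sigma(B'_k) - \sum_k f(\xi'_k)\,\sigma(B'_k B_\varepsilon) \right| < M\varepsilon. \tag{17}$$

Combining (15), (16) and (17), we obtain 

$$\left| \sum_j f(\xi_j)\,\sigma(B_j) - \sum_k f(\xi'_k)\,\sigma(B'_k) \right| < 2\varepsilon\,[V_B(\sigma) + M].$$

Since this effectively generalizes formula (9), p. 67 to the present case, 
the rest of the proof is identical with that of Theorem 1. 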
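
Theorem 2 can be illustrated numerically for n = 1. The following sketch is an assumption-laden example rather than anything stated in the text: we take B = [0, ∞], the bounded function f(x) = cos x (which cannot be extended continuously to ∞), and the generating function F(x) = 1 − e^{−x} with F(∞) = 1, whose quasi-length satisfies σ(x, ∞] = e^{−x} and is therefore continuous at infinity. The partition is obtained from equal subintervals in the variable ξ via x = tan ξ, as in Sec. 4.4.3; the name `rs_sum_infinite` is hypothetical.

```python
import math


def rs_sum_infinite(f, F, n):
    """Riemann-Stieltjes sum over B = [0, inf] for the partition x_j = tan(xi_j),
    where the xi_j divide [0, pi/2] into n equal parts."""
    d = (math.pi / 2) / n
    total = 0.0
    x_prev = 0.0
    for j in range(1, n + 1):
        # the last block (x_{n-1}, inf] is infinite
        x_next = math.tan(j * d) if j < n else math.inf
        tag = math.tan((j - 0.5) * d)  # a finite point of the block
        total += f(tag) * (F(x_next) - F(x_prev))
        x_prev = x_next
    return total


def F(x):
    # generating function; math.exp(-inf) evaluates to 0.0, so F(inf) = 1
    return 1.0 - math.exp(-x)


approx = rs_sum_infinite(math.cos, F, 20000)
```

The sums settle near 1/2, which is the value of the ordinary improper integral of e^{−x} cos x over [0, ∞).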


Remark. The class of functions which are integrable with respect to a 
given quasi-volume $\sigma$ also contains discontinuous functions, and criteria for 
Riemann-Stieltjes integrability, analogous to those given in Chap. 1, could 
be established. However, we shall not bother to do so, since the Lebesgue-
Stieltjes integral, to be constructed later (see Chap. 5), leads to a much 
larger class of integrable functions than the Riemann-Stieltjes integral. 


4.4.4. Equivalent quasi-volumes: a preview. There are cases where different 
quasi-volumes $\sigma_1$ and $\sigma_2$, defined on different dense sets of blocks $Q_1 = Q(\sigma_1)$ 
and $Q_2 = Q(\sigma_2)$, or even on the same set $Q = Q_1 = Q_2$, lead to the same 
Riemann-Stieltjes integral, in the sense that 

$$\int_B f(x)\,\sigma_1(dx) = \int_B f(x)\,\sigma_2(dx) \tag{18}$$

for every function f(x) continuous in B. 


Example. Let n = 1 and suppose the quasi-length $\sigma_1(\alpha, \beta]$ equals zero 
for every half-open interval $(\alpha, \beta]$ which neither contains a given point c 
$(a < c < b)$ nor has c as its left-hand end point, or else contains c as an 
interior point. Moreover, suppose $\sigma_1(\alpha, c] = +1$ for every $\alpha < c$, while 
$\sigma_1(c, \beta] = -1$ for every $\beta > c$. On the other hand, let $\sigma_2(\alpha, \beta]$ be the 
quasi-length identically equal to zero. Then it is easily verified that 

$$\int_a^b f(x)\,\sigma_1(dx) = \int_a^b f(x)\,\sigma_2(dx) = 0$$

for every function f(x) continuous in [a, b]. 
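
The verification can be carried out numerically. In the sketch below (an illustration only; the interval [0, 1], the point c = 1/2, the test function exp and the name `rs_sum` are assumptions), the quasi-length $\sigma_1$ is described by its generating function, which equals 1 at x = c and 0 elsewhere. When c is a point of subdivision, exactly two terms of the sum survive, namely $f(\xi_j) - f(\xi_{j+1})$ for the two adjacent tags, and their difference shrinks with the partition because f is continuous.

```python
import math


def rs_sum(f, F, n):
    """Riemann-Stieltjes sum over [0, 1] with nodes j/n and right-end-point tags."""
    return sum(f(j / n) * (F(j / n) - F((j - 1) / n)) for j in range(1, n + 1))


c = 0.5


def F1(x):
    # generating function of sigma_1: sigma_1(alpha, c] = +1, sigma_1(c, beta] = -1
    return 1.0 if x == c else 0.0


def F2(x):
    # generating function of the identically zero quasi-length sigma_2
    return 0.0


# even n, so that c = 1/2 is a node of every partition
sums = [rs_sum(math.exp, F1, n) for n in (10, 100, 1000, 10000)]
zero_sum = rs_sum(math.exp, F2, 100)
```

The entries of `sums` decrease in absolute value roughly like 1/n, so that both integrals in the example vanish in the limit.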


Two quasi-volumes $\sigma_1$ and $\sigma_2$ satisfying the condition (18) are said to be 
equivalent. The subject of equivalent quasi-volumes will be studied in detail 
in Sec. 5.7. For the time being, we merely anticipate some results showing 
the reader what is at issue: 


1) Every quasi-volume $\sigma$ of bounded variation is equivalent to a quasi-volume 
$\tilde{\sigma}$, also of bounded variation, which is defined on all blocks $B \subset \mathbf{B}$ 
(where $\mathbf{B}$ is the basic block) and is upper continuous in the sense that 

$$\tilde{\sigma}(B) = \lim_{m \to \infty} \tilde{\sigma}(B_m)$$

for any sequence of blocks $B_m$ converging downward to the block B 
(symbolically $B_m \searrow B$).³ 

2) Two equivalent upper continuous quasi-volumes coincide on all blocks 
$B \subset \mathbf{B}$. 

3) The quasi-volume $\tilde{\sigma}$ is determined from the quasi-volume $\sigma$ by the 
formula 

$$\tilde{\sigma}(B) = \lim_{m \to \infty} \sigma(B_m),$$

where $B_m \in Q(\sigma)$ and $B_m \searrow B$. 


³ Roughly speaking, the boundaries of $B_m$ move downwards, approaching those of B 
from above, in a sense to be made precise in Sec. 5.5. 




4.5. Essential Convergence. The Helly Theorems 


Given a sequence of quasi-volumes $\sigma_1, \ldots, \sigma_m, \ldots$ defined on the 
subblocks of a basic block B, we would like to define the concept of convergence 
$\sigma_m \to \sigma$ in such a way that the formula 

$$\lim_{m \to \infty} \int_B f(x)\,\sigma_m(dx) = \int_B f(x)\,\sigma(dx) \tag{19}$$

holds for every function continuous in B. The appropriate definition 
turns out to be the following: Given a sequence of quasi-volumes $\sigma_1, \ldots, 
\sigma_m, \ldots$ and another quasi-volume $\sigma$, all defined on the same dense set of 
blocks Q in B, we say that $\sigma_m$ is essentially convergent to $\sigma$ if 


1) The total variations $V_B(\sigma_m)$ form a bounded sequence; 
2) For every block $B \in Q$, 

$$\lim_{m \to \infty} \sigma_m(B) = \sigma(B).$$

Clearly, the quasi-volume $\sigma(B)$ is of bounded variation, like the quasi-
volumes $\sigma_m(B)$ themselves. In fact, given any partition of the basic block B 
into disjoint blocks $B_1, \ldots, B_p \in Q$, we have 

$$\sum_{j=1}^{p} |\sigma(B_j)| = \lim_{m \to \infty} \sum_{j=1}^{p} |\sigma_m(B_j)| \le C,$$

and since this estimate does not depend on the choice of the partition, the 
assertion is proved. 


THEOREM 3 (Helly's convergence theorem). If the sequence of quasi-
volumes $\sigma_1, \ldots, \sigma_m, \ldots$ is essentially convergent to the quasi-volume $\sigma$, 
then the limit relation (19) holds for every function f(x) continuous in B. 


Proof. Given any $\varepsilon > 0$, let $\Pi = \{B_j\}$, $B_j \in Q$ be a partition of B 
so fine that $\Pi$ belongs to $\varepsilon$ in the sense defined on p. 66. Then, according 
to formula (10), we have 

$$\left| \int_B f(x)\,\sigma(dx) - \sum_j f(\xi_j)\,\sigma(B_j) \right| \le \varepsilon V_B(\sigma),$$

$$\left| \int_B f(x)\,\sigma_m(dx) - \sum_j f(\xi_j)\,\sigma_m(B_j) \right| \le \varepsilon V_B(\sigma_m).$$

Moreover, let N be an integer so large that 

$$\left| \sum_j f(\xi_j)\,\sigma(B_j) - \sum_j f(\xi_j)\,\sigma_m(B_j) \right| < \varepsilon$$

for all m > N. Combining the last three inequalities, we obtain 

$$\left| \int_B f(x)\,\sigma(dx) - \int_B f(x)\,\sigma_m(dx) \right| \le \varepsilon\,[V_B(\sigma) + V_B(\sigma_m) + 1] \le (2C + 1)\varepsilon \tag{20}$$

for all m > N, where 

$$C = \sup \{V_B(\sigma), V_B(\sigma_1), V_B(\sigma_2), \ldots\}. \tag{21}$$

But this implies (19), since $\varepsilon$ is arbitrary and C is independent of m. 
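
A simple numerical experiment illustrates Theorem 3 for n = 1. The example is our own and is only a sketch: on [0, 1] we take the generating functions $F_m(x) = x^m$, each nondecreasing with total variation 1, which converge at every point to the function equal to 0 for x < 1 and to 1 for x = 1. The integral with respect to the limit is f(1), and the integrals with respect to $F_m$ should approach this value. The names `rs_sum` and `F_limit` are hypothetical.

```python
import math


def rs_sum(f, F, n):
    """Riemann-Stieltjes sum over [0, 1] with nodes j/n and midpoint tags."""
    return sum(
        f((j - 0.5) / n) * (F(j / n) - F((j - 1) / n)) for j in range(1, n + 1)
    )


def F_limit(x):
    # pointwise limit of x**m on [0, 1]: a unit jump at x = 1
    return 1.0 if x == 1.0 else 0.0


f = math.cos
n = 20000
limit_value = rs_sum(f, F_limit, n)  # close to f(1) = cos 1

# integrals of f with respect to F_m(x) = x**m for increasing m
values = {m: rs_sum(f, lambda x, m=m: x ** m, n) for m in (1, 10, 100, 1000)}
```

For m = 1 the sum approximates sin 1 (the ordinary integral of cos x), while for large m the values cluster around cos 1, in agreement with (19).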


Remark 1. Formulas (19) and (20) can be written more concisely as 

$$\lim_{m \to \infty} I_{\sigma_m} f = I_\sigma f$$

and 

$$|I_\sigma f - I_{\sigma_m} f| \le (2C + 1)\varepsilon. \tag{22}$$


Remark 2. Theorem 3 can be generalized somewhat by allowing the 
function f(x) to depend on m. In fact, if $\sigma_m$ is a sequence of quasi-volumes 
converging essentially to a quasi-volume $\sigma$ and if $f_m$ is a sequence of 
continuous functions converging uniformly to a (continuous) function f, then 

$$\lim_{m \to \infty} I_{\sigma_m} f_m = I_\sigma f.$$

This is an immediate consequence of (22) and the estimates 

$$|I_\sigma (f - f_m)| \le C \max |f - f_m|,$$

$$|I_{\sigma_m} (f - f_m)| \le C \max |f - f_m|,$$

involving the same constant C defined by (21). 


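Remark 3. Suppose the block B is infinite, with improper boundary $\Gamma$. 
Then Theorem 3 remains valid if f(x) is continuous and bounded in $B - \Gamma$, 
even if f(x) cannot be continuously extended onto $\Gamma$, provided the sequence 
$\sigma_m$ is equicontinuous at infinity in the following sense (which is the natural 
generalization of the definition on p. 69): Given any $\varepsilon > 0$, there exists a 
finite block $B_\varepsilon \subset B$, $B_\varepsilon \in Q$ such that 

$$\sum_j |\sigma_m(B_j)| < \varepsilon \tag{23}$$

for arbitrary disjoint blocks $B_j$ contained in $B - B_\varepsilon$ ($B_j \in Q$) and all m. In 
fact, let $B_\varepsilon \subset B$, $B_\varepsilon \in Q$ be such that (23) holds, and let $\Pi = \{B_j\}$ be a 
partition of B belonging to the number $\varepsilon$. Then, according to formula (16), 
p. 70, 

$$\left| \sum_j f(\xi_j)\,\sigma_m(B_j) - \sum_j f(\xi_j)\,\sigma_m(B_j B_\varepsilon) \right| < M\varepsilon,$$

where 

$$M = \sup |f(x)|.$$

Moreover, 

$$\left| \sum_j f(\xi_j)\,\sigma(B_j) - \sum_j f(\xi_j)\,\sigma(B_j B_\varepsilon) \right| \le M\varepsilon,$$

since passage to the limit $m \to \infty$ in (23) gives 

$$\sum_j |\sigma(B_j)| \le \varepsilon.$$

It follows that 

$$\left| \sum_j f(\xi_j)\,\sigma(B_j) - \sum_j f(\xi_j)\,\sigma_m(B_j) \right| \le 2M\varepsilon + \left| \sum_j f(\xi_j)\,\sigma(B_j B_\varepsilon) - \sum_j f(\xi_j)\,\sigma_m(B_j B_\varepsilon) \right|$$

$$= 2M\varepsilon + \left| \sum_j f(\xi_j)\,[\sigma(B_j B_\varepsilon) - \sigma_m(B_j B_\varepsilon)] \right| < (2M + 1)\varepsilon,$$

provided m is sufficiently large. Letting $d(\Pi) \to 0$, we obtain 

$$|I_\sigma f - I_{\sigma_m} f| \le (2M + 1)\varepsilon$$

for sufficiently large m, where the existence of the integrals is guaranteed 
by Theorem 2. Therefore 

$$\lim_{m \to \infty} I_{\sigma_m} f = I_\sigma f,$$

as required. 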


THEOREM 4 (Helly's selection principle). Let $\mathfrak{A} = \{\sigma_\alpha(B)\}$ be an 
infinite family of quasi-volumes, all defined on the same dense set of blocks 
Q in B, where 

$$V_B(\sigma_\alpha) \le C$$

for every $\sigma_\alpha \in \mathfrak{A}$. Then $\mathfrak{A}$ contains a sequence $\sigma_m(B)$ which is essentially 
convergent to a quasi-volume $\sigma(B)$ of bounded variation. 


Proof. Clearly, Q contains a sequence of blocks $B_1, B_2, \ldots$ which is 
dense in B. Since the set of numbers $\sigma_\alpha(B_1)$ is bounded, there is a sequence 
of quasi-volumes $\sigma_{1m} \in \mathfrak{A}$ such that the numerical sequence $\sigma_{1m}(B_1)$ is 
convergent. From the sequence $\sigma_{1m}$ we can select a subsequence $\sigma_{2m}$ such 
that $\sigma_{2m}(B_2)$, as well as $\sigma_{2m}(B_1)$, is convergent. Continuing this 
construction, given any integer p, we can find a sequence of quasi-volumes $\sigma_{pm}$ 
such that the numerical sequences $\sigma_{pm}(B_1), \ldots, \sigma_{pm}(B_p)$ all converge 
(as $m \to \infty$). Therefore the diagonal sequence $\sigma_{11}, \sigma_{22}, \ldots, \sigma_{mm}, \ldots$ converges on 
all the blocks $B_1, B_2, \ldots$ The (essential) limit $\sigma(B)$ of the sequence $\sigma_{mm}(B)$, 
defined on all the blocks $B_1, B_2, \ldots$, obviously represents a quasi-
volume. Since $\sigma(B)$ is of bounded variation, by the argument given on 
p. 72, the proof is now complete. 


Remark. For the case n = 1, the above results can all be paraphrased in 
terms of generating functions. Thus, given a sequence of generating functions 
$F_1, \ldots, F_m, \ldots$ and another generating function F, all defined on the same 
dense set $E \subset [a, b]$, we say that $F_m$ is essentially convergent to F if 


1) The total variations $V_a^b(F_m)$ form a bounded sequence; 
2) For every $x \in E$, 

$$\lim_{m \to \infty} F_m(x) = F(x).$$

Clearly, the limit function F(x) is of bounded variation, like the generating 
functions $F_m(x)$ themselves. According to Helly's convergence theorem, if 
the sequence of generating functions $F_m(x)$ is essentially convergent to the 
function F(x), then 

$$\lim_{m \to \infty} \int_a^b f(x)\,dF_m(x) = \int_a^b f(x)\,dF(x) \tag{24}$$

for every function continuous in the interval [a, b], while, according to Helly's 
selection principle, if $\mathfrak{A} = \{F_\alpha(x)\}$ is an infinite family of generating functions, 
all defined on the same dense set $E \subset [a, b]$, where $V_a^b(F_\alpha) \le C$ for every 
$F_\alpha \in \mathfrak{A}$, then $\mathfrak{A}$ contains a sequence $F_m(x)$ which is essentially convergent 
to a generating function F(x) of bounded variation. Moreover, (24) continues 
to hold if the function f(x) itself depends on m, or if the interval [a, b] is 
infinite. 


*4.6. Applications to Analysis 


The theorems on passage to the limit inside a Stieltjes integral have 
numerous applications to mathematical analysis. We now digress to point 
some of these out. 


4.6.1. Herglotz's theorem. Consider the problem of finding the general 
form of a function $w = f(z)$ which is analytic in the unit disk $|z| < 1$ and has 
a nonnegative real part, i.e., which maps the unit disk into the right half-plane. 
First we note that $w = \alpha + i\beta$ $(\alpha \ge 0)$ is such a function, and so is 

$$f_t(z) = \frac{e^{it} + z}{e^{it} - z},$$

where t is an arbitrary real number. In fact, for fixed t and $|z| < 1$, the 
points $z_1 = e^{it} + z$ and $z_2 = e^{it} - z$ lie inside the circle $\Gamma$ of unit radius 
with center at the point $e^{it}$. Moreover, $z_1$ and $z_2$ lie on the same diameter 
of $\Gamma$, while the origin O lies on $\Gamma$ itself (see Figure 1). But, by elementary 
geometry, the whole diameter subtends the angle $\pi/2$ at O, and hence the 
segment of the diameter joining the points $z_1$ and $z_2$ subtends an angle 
$\alpha < \pi/2$. In other words, 

$$|\arg f_t(z)| = |\arg z_1 - \arg z_2| < \frac{\pi}{2},$$

which implies $\operatorname{Re} f_t(z) > 0$, as asserted. 
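
The positivity of the real part can also be seen from the identity $\operatorname{Re} f_t(z) = (1 - |z|^2)/|e^{it} - z|^2$, obtained by multiplying numerator and denominator by the conjugate of $e^{it} - z$. The following sketch checks both statements at random points of the disk; the function names, the random sampling and the radius bound 0.99 are assumptions made purely for illustration.

```python
import cmath
import math
import random


def f_t(z, t):
    """The function (e^{it} + z) / (e^{it} - z)."""
    e = cmath.exp(1j * t)
    return (e + z) / (e - z)


def real_part_formula(z, t):
    """Closed form (1 - |z|^2) / |e^{it} - z|^2 of the real part of f_t(z)."""
    return (1 - abs(z) ** 2) / abs(cmath.exp(1j * t) - z) ** 2


random.seed(0)
samples = []
for _ in range(1000):
    radius = 0.99 * random.random()
    angle = 2 * math.pi * random.random()
    z = cmath.rect(radius, angle)          # a point with |z| < 1
    t = 2 * math.pi * random.random()
    samples.append((f_t(z, t).real, real_part_formula(z, t)))

smallest_real_part = min(re for re, _ in samples)
largest_discrepancy = max(abs(re - formula) for re, formula in samples)
```

Every sampled real part is positive, and it agrees with the closed form up to rounding error.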

Remarkably enough, it turns out that every function $w = f(z)$ analytic 
in the unit disk with a nonnegative real part is a "Stieltjes combination" of 
the particularly simple functions $f_t(z)$, $0 \le t \le 2\pi$: 


THEOREM 5 (Herglotz's theorem). If f(z) is analytic in the disk $|z| < 1$ 
and has a nonnegative real part, then f(z) can be represented in the form 

$$f(z) = \int_0^{2\pi} \frac{e^{it} + z}{e^{it} - z}\,dF(t) + i\beta, \tag{25}$$

where $\beta$ is a real number and F(t) is a bounded nondecreasing function. 
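
To see the representation (25) at work in the simplest case, one can evaluate the Stieltjes sums for the generating function $F(t) = \alpha t / 2\pi$ with $\alpha \ge 0$, for which the integral should reduce to the constant $\alpha$. This is only a sketch under our own assumptions: the name `herglotz_sum`, the equal subdivision of [0, 2π] with midpoint tags, and the sample values of $\alpha$, $\beta$ and z are not taken from the text.

```python
import cmath
import math


def herglotz_sum(z, F, beta, n):
    """Riemann-Stieltjes sum approximating the right-hand side of (25)."""
    h = 2 * math.pi / n
    total = 0j
    for j in range(1, n + 1):
        t = (j - 0.5) * h                      # tag inside ((j-1)h, jh]
        e = cmath.exp(1j * t)
        total += (e + z) / (e - z) * (F(j * h) - F((j - 1) * h))
    return total + 1j * beta


alpha, beta = 2.0, -0.7


def F_uniform(t):
    # bounded nondecreasing generating function with F(0) = 0, F(2*pi) = alpha
    return alpha * t / (2 * math.pi)


w = herglotz_sum(0.3 + 0.4j, F_uniform, beta, 512)
```

The computed value is close to $\alpha + i\beta$ regardless of the point z chosen in the disk, as expected for a constant function.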


Proof. As is well known, a function f(z) analytic in the disk $|z| \le 
r < 1$ can be represented in terms of the boundary values of its real part 
u(z) by using Schwarz's formula⁴ 

$$f(z) = \frac{1}{2\pi} \int_0^{2\pi} \frac{re^{it} + z}{re^{it} - z}\,u(re^{it})\,dt + i\beta \qquad (|z| < r).$$

The integral on the right can be written in the form 

$$f(z) = \int_0^{2\pi} \frac{re^{it} + z}{re^{it} - z}\,dF_r(t) + i\beta,$$

where 

$$F_r(t) = \frac{1}{2\pi} \int_0^{t} u(re^{i\tau})\,d\tau$$

is a nondecreasing function of t. Moreover, by a familiar property of 
harmonic functions,⁵ we have 

$$F_r(t) \le F_r(2\pi) = \frac{1}{2\pi} \int_0^{2\pi} u(re^{it})\,dt = u(0).$$

[Figure 1]


⁴ See e.g., A. I. Markushevich, Theory of Functions of a Complex Variable, Volume II 
(translated by R. A. Silverman), Prentice-Hall, Inc., Englewood Cliffs, N.J. (1965), Theorem 
5.5, Corollary. 

⁵ I.e., u(0) equals its average on the circle $|z| = r$ (see e.g., A. I. Markushevich, op. cit., 
Theorem 5.6). 


Therefore the total variations $V_0^{2\pi}(F_r)$, $0 < r < 1$ form a bounded set. 
Let $\{r_n\}$ be an increasing sequence of numbers in the interval (0, 1), 
exceeding $|z|$ and converging to 1. Then it follows from Helly's selection 
principle that the sequence of generating functions $\{F_{r_n}(t)\}$ contains a 
subsequence $\{F_{r_{n_k}}(t)\}$ which is essentially convergent to a bounded 
nondecreasing generating function F(t). Moreover, for fixed z, the sequence 
of functions 

$$\frac{r_n e^{it} + z}{r_n e^{it} - z} \qquad (n = 1, 2, \ldots),$$

each continuous in the interval $0 \le t \le 2\pi$, converges uniformly in the 
same interval to the limit function 

$$\frac{e^{it} + z}{e^{it} - z}.$$

Therefore, by Helly's convergence theorem 

$$f(z) = \lim_{k \to \infty} \int_0^{2\pi} \frac{r_{n_k} e^{it} + z}{r_{n_k} e^{it} - z}\,dF_{r_{n_k}}(t) + i\beta = \int_0^{2\pi} \frac{e^{it} + z}{e^{it} - z}\,dF(t) + i\beta$$

(cf. Remark 2, p. 73), and the theorem is proved. Note that the 
representation (25) includes the case $w = \alpha + i\beta$ $(\alpha \ge 0)$ if we set $F(t) = \alpha t / 2\pi$. 


4.6.2. Bernstein's theorem. A function f(x) is said to be completely 
monotonic in the interval [a, b] if all its derivatives exist and satisfy the inequalities 

$$(-1)^n f^{(n)}(x) \ge 0 \qquad (n = 0, 1, 2, \ldots) \tag{26}$$

for every $x \in [a, b]$. A nonnegative constant is completely monotonic, and 
so is the function $e^{-\alpha x}$ $(\alpha \ge 0)$. As we now show,⁶ it turns out that every 
completely monotonic function in the interval $x \ge 0$ is a "Stieltjes 
combination" of the particularly simple functions $e^{-\alpha x}$ $(\alpha \ge 0)$. 


LEMMA 1. If f(x) satisfies (26), then 

$$\lim_{x \to \infty} x^n f^{(n)}(x) = 0 \qquad (n = 1, 2, \ldots), \tag{27}$$

and moreover 

$$\frac{1}{n!} \int_0^\infty x^n\,|f^{(n+1)}(x)|\,dx = f(0) - f(\infty) < \infty \qquad (n = 0, 1, 2, \ldots). \tag{28}$$


⁶ Following B. I. Korenblyum, On two theorems from the theory of absolutely monotonic 
functions (in Russian), Usp. Mat. Nauk, 6, no. 4, 172 (1951). 




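Proof. According to (26), every function $(-1)^n f^{(n)}(x)$ is 
nonincreasing. Therefore, for x > 0, 

$$|f^{(n)}(x)| = (-1)^n f^{(n)}(x) \le \frac{2}{x} \int_{x/2}^{x} (-1)^n f^{(n)}(t)\,dt = \frac{2}{x}\,(-1)^n \left[ f^{(n-1)}(x) - f^{(n-1)}(x/2) \right] \qquad (n = 1, 2, \ldots),$$

and (27) follows by induction, since clearly $f(x) - f(x/2) \to 0$ as $x \to \infty$. 
Moreover, integrating 

$$(-1)^{n+1} \int_0^\infty x^n f^{(n+1)}(x)\,dx = \int_0^\infty x^n\,|f^{(n+1)}(x)|\,dx$$

by parts n times, and using (27), we obtain 

$$\frac{1}{n!} \int_0^\infty x^n\,|f^{(n+1)}(x)|\,dx = \frac{(-1)^{n+1}}{n!} \int_0^\infty x^n f^{(n+1)}(x)\,dx = -\int_0^\infty f'(x)\,dx = f(0) - f(\infty) < \infty,$$

which agrees with (28). 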
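
As a quick consistency check of Lemma 1 (a worked example of our own, not part of the original text), take the completely monotonic function $f(x) = e^{-x}$, for which $f^{(n)}(x) = (-1)^n e^{-x}$, $f(0) = 1$ and $f(\infty) = 0$:

```latex
% (27): the power x^n is dominated by the exponential
\lim_{x \to \infty} x^n f^{(n)}(x) = (-1)^n \lim_{x \to \infty} x^n e^{-x} = 0
\qquad (n = 1, 2, \ldots),

% (28): the integral is the gamma function, \Gamma(n+1) = n!
\frac{1}{n!} \int_0^\infty x^n \, |f^{(n+1)}(x)| \, dx
  = \frac{1}{n!} \int_0^\infty x^n e^{-x} \, dx
  = \frac{\Gamma(n+1)}{n!}
  = 1
  = f(0) - f(\infty)
\qquad (n = 0, 1, 2, \ldots).
```

Both (27) and (28) therefore hold for this function, the common value in (28) being independent of n, as the lemma asserts.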


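LEMMA 2. If 

$$\varphi_n(x) = \begin{cases} \left(1 - \dfrac{x}{n}\right)^n & \text{for } 0 \le x \le n, \\ 0 & \text{for } n < x \le \infty, \end{cases}$$

then 

$$\lim_{n \to \infty} \varphi_n(x) = e^{-x}$$

uniformly for $x \ge 0$. 


Proof. For n > 1, the function $|\varphi_n(x) - e^{-x}|$ achieves its maximum 
at the point $a_n$ $(0 < a_n < \infty)$, where 

$$\varphi_n'(a_n) + e^{-a_n} = 0,$$

i.e., where 

$$\left(1 - \frac{a_n}{n}\right)^{n-1} = e^{-a_n} \qquad (0 < a_n < n).$$

Therefore 

$$|\varphi_n(x) - e^{-x}| \le |\varphi_n(a_n) - e^{-a_n}| = \frac{a_n}{n}\,e^{-a_n} \le \frac{1}{ne} \qquad (n = 2, 3, \ldots)$$

for all $x \in [0, \infty]$, and the lemma is proved. 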
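
The bound 1/(ne) in the proof is easy to test numerically. The sketch below is illustrative only: the names `phi` and `sup_error`, the uniform grid and the cutoff `x_max` are assumptions. Beyond the cutoff the error is at most $e^{-x}$, which is far below the bound for the values of n used here.

```python
import math


def phi(n, x):
    """The function phi_n of Lemma 2."""
    return (1 - x / n) ** n if x <= n else 0.0


def sup_error(n, grid=20000, x_max=50.0):
    """Largest value of |phi_n(x) - exp(-x)| over a uniform grid on [0, x_max]."""
    worst = 0.0
    for k in range(grid + 1):
        x = k * x_max / grid
        worst = max(worst, abs(phi(n, x) - math.exp(-x)))
    return worst


# pairs (observed maximum error, theoretical bound 1/(n e))
errors = {n: (sup_error(n), 1 / (n * math.e)) for n in (2, 5, 10, 50, 200)}
```

In each case the observed maximum stays below 1/(ne) and decreases as n grows, which is the uniform convergence asserted by the lemma.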


THEOREM 6 (Bernstein's theorem). If f(x) is completely monotonic 
in the interval $x \ge 0$, then f(x) can be represented in the form 

$$f(x) = \int_0^\infty e^{-\alpha x}\,dF(\alpha) + C, \tag{29}$$

where C is a nonnegative constant and $F(\alpha)$ is a bounded nondecreasing 
function. 




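Proof. Using (27), integrating by parts n times and making certain 
obvious changes of variables, we obtain 

$$f(x) - f(\infty) = -\int_x^\infty f'(t)\,dt = \frac{(-1)^{n+1}}{n!} \int_x^\infty (t - x)^n f^{(n+1)}(t)\,dt$$

$$= \frac{(-1)^{n+1}}{n!} \int_{x/n}^\infty \left(1 - \frac{x}{nt}\right)^n (nt)^n f^{(n+1)}(nt)\,n\,dt$$

$$= \frac{(-1)^{n+1}}{n!} \int_0^\infty \varphi_n\!\left(\frac{x}{t}\right) (nt)^n f^{(n+1)}(nt)\,n\,dt = \int_0^\infty \varphi_n(\alpha x)\,dF_n(\alpha) \qquad (n = 1, 2, \ldots),$$

where $\varphi_n$ is the same as in Lemma 2, and 

$$F_n(\alpha) = \frac{(-1)^{n+1}}{n!} \int_{1/\alpha}^\infty (nt)^n f^{(n+1)}(nt)\,n\,dt$$

[the integrals are all absolutely convergent because of (28)]. Clearly, 
every $F_n(\alpha)$ is bounded and nondecreasing, and moreover the total 
variations $V_0^\infty(F_n)$ form a bounded sequence, since 

$$V_0^\infty(F_n) = \frac{1}{n!} \int_0^\infty (nt)^n\,|f^{(n+1)}(nt)|\,n\,dt = \frac{1}{n!} \int_0^\infty t^n\,|f^{(n+1)}(t)|\,dt = f(0) - f(\infty) \qquad (n = 1, 2, \ldots).$$

Therefore, by Helly's selection principle, the sequence of generating 
functions $F_n(\alpha)$ contains a subsequence $F_{n_k}(\alpha)$ which is essentially 
convergent to a bounded nondecreasing generating function $F(\alpha)$. It follows 
from Helly's convergence theorem that⁷ 

$$f(x) - f(\infty) = \lim_{k \to \infty} \int_0^\infty \varphi_{n_k}(\alpha x)\,dF_{n_k}(\alpha) = \int_0^\infty e^{-\alpha x}\,dF(\alpha)$$

or 

$$f(x) = \int_0^\infty e^{-\alpha x}\,dF(\alpha) + f(\infty),$$

which agrees with (29), since obviously $f(\infty) \ge 0$. The term $f(\infty)$ can be 
incorporated into the integral, if we replace $F(\alpha)$ by the generating 
function 

$$G(\alpha) = \begin{cases} f(\infty) + F(\alpha) & \text{for } \alpha > 0, \\ 0 & \text{for } \alpha = 0. \end{cases}$$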
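
A concrete instance of the representation (29) may help. The example is our own: the function f(x) = 1/(1 + x) is completely monotonic for $x \ge 0$, since $(-1)^n f^{(n)}(x) = n!/(1 + x)^{n+1}$, and it equals the integral of $e^{-\alpha x} e^{-\alpha}$ over $0 \le \alpha < \infty$, so that one may take $F(\alpha) = 1 - e^{-\alpha}$ and C = f(∞) = 0. The sketch below approximates the Stieltjes integral by sums over a truncated range; the name `bernstein_integral`, the cutoff `alpha_max` and the number of subintervals are assumptions.

```python
import math


def bernstein_integral(x, F, alpha_max=60.0, n=60000):
    """Riemann-Stieltjes sum for the integral of exp(-alpha * x) dF(alpha)
    over [0, alpha_max], with midpoint tags."""
    h = alpha_max / n
    return sum(
        math.exp(-(j - 0.5) * h * x) * (F(j * h) - F((j - 1) * h))
        for j in range(1, n + 1)
    )


def F(alpha):
    # bounded nondecreasing generating function, F(0) = 0
    return 1.0 - math.exp(-alpha)


# pairs (Stieltjes sum, value of 1 / (1 + x))
checks = {x: (bernstein_integral(x, F), 1.0 / (1.0 + x)) for x in (0.0, 0.5, 1.0, 3.0)}
```

The two entries of each pair agree to several decimal places, the neglected tail beyond `alpha_max` being of the order of $e^{-60}$.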


⁷ Recall Remark 2, p. 73, observing that the functions $\varphi_n(\alpha x)$ and $e^{-\alpha x}$ are continuous 
at $\infty$ (for fixed x). 




A closely related concept is that of an absolutely monotonic function, 
i.e., a function f(x) is said to be absolutely monotonic in the interval [a, b] 
if all its derivatives exist and are nonnegative for every $x \in [a, b]$: 

$$f^{(n)}(x) \ge 0 \qquad (n = 0, 1, 2, \ldots).$$

The corresponding version of Bernstein's theorem is then 


THEOREM 6'. If f(x) is absolutely monotonic in the interval $x \le 0$, 
then f(x) can be represented in the form 

$$f(x) = \int_0^\infty e^{\alpha x}\,dF(\alpha) + C,$$

where C is a nonnegative constant and $F(\alpha)$ is a bounded nondecreasing 
function. 


Proof. If f(x) is absolutely monotonic in the new sense, then f(−x) 
is completely monotonic in the sense of the definition (26). 


4.6.3. The Bochner-Khinchin theorem. A complex-valued function f(x) is 
said to be positive definite in the interval [a, b] if, given any n real numbers 
$x_1, \ldots, x_n$ in [a, b] (where n itself is arbitrary), the $n \times n$ matrix $\| f(x_j - x_k) \|$ 
is positive definite, i.e., the quadratic form 

$$\sum_{j,k=1}^{n} f(x_j - x_k)\,\xi_j \bar{\xi}_k$$

is nonnegative for arbitrary complex numbers $\xi_1, \ldots, \xi_n$ (the overbar 
denotes the complex conjugate). Any nonnegative constant is positive 
definite in any interval, and so is the function $e^{i\alpha x}$ ($\alpha$ real), since 

$$\sum_{j,k=1}^{n} e^{i\alpha(x_j - x_k)}\,\xi_j \bar{\xi}_k = \sum_{j=1}^{n} e^{i\alpha x_j} \xi_j \; \overline{\sum_{k=1}^{n} e^{i\alpha x_k} \xi_k} = \left| \sum_{j=1}^{n} e^{i\alpha x_j} \xi_j \right|^2 \ge 0.$$

It turns out that every positive definite function defined on the real line 
is a "Stieltjes combination" of the particularly simple positive definite 
functions $e^{i\alpha x}$. In fact, the celebrated Bochner-Khinchin theorem asserts that 
if f(x) is positive definite and continuous for all x, then f(x) can be represented 
in the form 

$$f(x) = \int_{-\infty}^{\infty} e^{i\alpha x}\,dF(\alpha),$$

where $F(\alpha)$ is a bounded nondecreasing function. Since, for fixed x, the function 
$e^{i\alpha x}$ cannot be continuously extended onto the improper boundary of the 
interval $[-\infty, \infty]$, the proof requires extra care, and, as might be expected, 
considerations like those given in Remark 3, p. 73 play a role. We omit 
the details, which would lead us too far afield.⁸ 
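
The positive definiteness of $e^{i\alpha x}$ noted above can be confirmed by direct computation of the quadratic form. The sketch below uses arbitrary sample data; the name `quadratic_form`, the value of $\alpha$ and the random points and coefficients are assumptions made for illustration.

```python
import cmath
import random


def quadratic_form(f, xs, xis):
    """Sum over j, k of f(x_j - x_k) * xi_j * conj(xi_k)."""
    return sum(
        f(xj - xk) * a * b.conjugate()
        for xj, a in zip(xs, xis)
        for xk, b in zip(xs, xis)
    )


random.seed(1)
alpha = 1.7


def f(x):
    return cmath.exp(1j * alpha * x)


xs = [random.uniform(-5.0, 5.0) for _ in range(6)]
xis = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(6)]

q = quadratic_form(f, xs, xis)
# the form should equal the squared modulus of a single sum
s = sum(f(x) * xi for x, xi in zip(xs, xis))
```

Up to rounding, `q` is real and coincides with the squared modulus of `s`, which is why the form cannot be negative.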


⁸ See G. E. Shilov, Mathematical Analysis, A Special Course (translated and edited by 
J. D. Davis and D. A. R. Wallace), Pergamon Press, Inc., New York (1965), p. 438. 




4.7. Structure of Signed Quasi-Volumes 


We now study the relation between signed quasi-volumes (i.e., quasi-
volumes which can take values of either sign) and nonnegative quasi-volumes. 
The perceptive reader will note the analogy between this section and Sec. 
2.11. 


4.7.1. Representation of a signed quasi-volume $\sigma$ as the difference between 
two nonnegative quasi-volumes. As already noted in Sec. 4.2, a nonnegative 
quasi-volume p(B) is always of bounded variation if it is bounded, i.e., if 
$p(\mathbf{B}) < \infty$, where $\mathbf{B}$ is the basic block. (Henceforth it will be assumed that all 
nonnegative quasi-volumes are bounded.) Given two nonnegative quasi-volumes 
p(B) and q(B), defined on the same dense set of blocks Q in $\mathbf{B}$, suppose we form 
the difference 

$$\sigma(B) = p(B) - q(B).$$

Then $\sigma(B)$, like p(B) and q(B), is obviously additive in the sense of Sec. 4.2, 
and is hence a quasi-volume. Moreover, the quasi-volume $\sigma(B)$, which is in 
general signed, is of bounded variation. In fact, given any set of disjoint 
blocks $B_1, \ldots, B_m$, we have 

$$\sum_{j=1}^{m} |\sigma(B_j)| \le \sum_{j=1}^{m} p(B_j) + \sum_{j=1}^{m} q(B_j) \le p(\mathbf{B}) + q(\mathbf{B}),$$

so that the sum on the left is bounded by a fixed constant, as required. We 
now show that the converse is also true: 


THEOREM 7. Every signed quasi-volume $\sigma$ of bounded variation, defined 
on a dense set of blocks Q in B, can be represented as the difference between 
two nonnegative quasi-volumes, defined on the same set Q. 


Proof. Given any block $B \in Q$, the quantity 

$$p(B) = \sup \sum_{j=1}^{m} \sigma(B_j),$$

where the least upper bound is taken with respect to all sets of disjoint 
subblocks $B_j \subset B$ $(j = 1, \ldots, m)$,⁹ is defined and nonnegative. Moreover, 
it is easy to see that p(B), like $\sigma(B)$, is additive, and hence a quasi-
volume. In fact, let $B^{(1)}, \ldots, B^{(s)}$ be any set of disjoint blocks whose 
union is B, and let $B_1, \ldots, B_m$ be any set of disjoint blocks contained 
in B. Then, on the one hand, we have 

$$\sum_{j=1}^{m} \sigma(B_j) = \sum_{j=1}^{m} \sum_{k=1}^{s} \sigma(B_j B^{(k)}) = \sum_{k=1}^{s} \sum_{j=1}^{m} \sigma(B_j B^{(k)}) \le \sum_{k=1}^{s} p(B^{(k)}),$$

which implies 

$$p(B) = \sup \sum_{j=1}^{m} \sigma(B_j) \le \sum_{k=1}^{s} p(B^{(k)}). \tag{30}$$

On the other hand, given any $\varepsilon > 0$, we can find a set of disjoint 
subblocks $B_j^{(k)}$ $(j = 1, \ldots, r_k)$ of the block $B^{(k)}$ such that 

$$p(B^{(k)}) < \sum_{j=1}^{r_k} \sigma(B_j^{(k)}) + \frac{\varepsilon}{s}.$$

Summing over the index k, we obtain 

$$\sum_{k=1}^{s} p(B^{(k)}) < \sum_{k=1}^{s} \sum_{j=1}^{r_k} \sigma(B_j^{(k)}) + \varepsilon \le p(B) + \varepsilon,$$

and hence 

$$\sum_{k=1}^{s} p(B^{(k)}) \le p(B), \tag{31}$$

since $\varepsilon$ is arbitrary. Together, (30) and (31) imply 

$$\sum_{k=1}^{s} p(B^{(k)}) = p(B),$$

i.e., the function p(B) is additive, and hence a quasi-volume, as asserted. 
Finally let 

$$q(B) = p(B) - \sigma(B). \tag{32}$$

Since $\sigma(B)$ and p(B) are quasi-volumes, so is q(B). Moreover, q(B) is 
nonnegative, since $p(B) \ge \sigma(B)$. The theorem now follows from the 
formula 

$$\sigma(B) = p(B) - q(B), \tag{33}$$

equivalent to (32). 


⁹ Here, as elsewhere in Sec. 4.7, all blocks are assumed to belong to the underlying dense 
set Q. 


4.7.2. Other representations of $\sigma$. The canonical representation. The 
representation (33) in terms of the nonnegative quasi-volumes p and q is not 
unique. In fact, let r be any nonnegative quasi-volume defined on the blocks 
$B \in Q$. Then, besides the representation (33), we can also write 

$$\sigma(B) = [p(B) + r(B)] - [q(B) + r(B)]. \tag{34}$$

It turns out that (34) is actually the most general representation of the signed 
quasi-volume $\sigma$ as a difference between two nonnegative quasi-volumes. 
To see this, suppose we have any representation 

$$\sigma(B) = p_1(B) - q_1(B), \tag{35}$$

where $p_1$ and $q_1$ are nonnegative quasi-volumes. Then, given any set $B_1, \ldots, 
B_m$ of disjoint subblocks of B, 

$$\sum_{j=1}^{m} \sigma(B_j) \le \sum_{j=1}^{m} p_1(B_j) \le p_1(B),$$

and hence 

$$p(B) = \sup \sum_{j=1}^{m} \sigma(B_j) \le p_1(B).$$

Therefore $p_1 = p + r$, where $r = p_1 - p$ is a nonnegative quasi-volume. 
Moreover, 

$$q_1(B) = p_1(B) - \sigma(B) = p(B) - \sigma(B) + r(B) = q(B) + r(B),$$

i.e., we have reduced (35) to the form (34), as required. 

At the same time, we find that the representation (33), explicitly 
constructed in Theorem 7, has a simple characterization in the class of all possible 
representations (35), i.e., the quasi-volumes p and q figuring in (33) are the 
smallest possible among all that can figure in (35). For this reason, (33) will 
be called the canonical representation of the quasi-volume $\sigma$. 


4.7.3. Formulas for the positive, negative and total variations. It will be 
recalled that the function p(B) is defined by the formula 

$$p(B) = \sup \sum_j \sigma(B_j),$$

where the least upper bound is taken with respect to all sets of disjoint 
subblocks $B_j \subset B$. We now derive analogous formulas for the functions q(B) 
and $v(B) = p(B) + q(B)$. 

Since the function q plays the same role in the representation of the 
quasi-volume 

$$-\sigma(B) = q(B) - p(B)$$

as played by p in the representation of $\sigma$, and since q and p are minimal 
in the sense of Sec. 4.7.2, it follows that 

$$q(B) = \sup \Bigl\{ -\sum_j \sigma(B_j) \Bigr\},$$

where the least upper bound is taken with respect to all sets of disjoint 
subblocks $B_j \subset B$. As for the quasi-volume v(B), consider a set of disjoint 
subblocks $B'_j \subset B$ such that 

$$\sum_j \sigma(B'_j) > p(B) - \varepsilon, \tag{36}$$

and let $B''_k$ be a set of subblocks such that B equals the union of all the $B'_j$ 
and $B''_k$. Then 

$$\sum_k [-\sigma(B''_k)] = -\sigma(B) + \sum_j \sigma(B'_j) > p(B) - \sigma(B) - \varepsilon = q(B) - \varepsilon,$$

and moreover, if $B_i$ is any of the blocks $B'_j$, $B''_k$, 

$$\sum_i |\sigma(B_i)| = \sum_j |\sigma(B'_j)| + \sum_k |\sigma(B''_k)| > p(B) + q(B) - 2\varepsilon = v(B) - 2\varepsilon.$$

Therefore 

$$v(B) \le \sup \sum_j |\sigma(B_j)|, \tag{37}$$

where the least upper bound is taken with respect to all partitions of the 
block B into disjoint subblocks. 

On the other hand, if $B = \bigcup_j B_j$ is an arbitrary partition of the block B 
into disjoint subblocks, then 

$$v(B) = \sum_j v(B_j) \ge \sum_j |\sigma(B_j)|, \tag{38}$$

since obviously 

$$|\sigma(B)| \le p(B) + q(B) = v(B).$$

Comparing (37) and (38), we find that 

$$v(B) = \sup \sum_j |\sigma(B_j)|, \tag{39}$$

where the least upper bound is taken with respect to all partitions of the 
block B into disjoint subblocks. The quantity (39) has already been 
encountered in Sec. 4.2, where it was called the total variation of the quasi-
volume $\sigma$. By the same token, the quasi-volume p is called the positive 
variation of $\sigma$, while q is called the negative variation of $\sigma$. Thus the total 
variation v is the sum of the positive and negative variations p and q. 


Remark. The above construction greatly resembles that given in Sec. 
2.11, where we represented a functional of variable sign as the difference 
between two nonnegative functionals. This suggests the possibility of a direct 
connection between the two constructions. Such a connection in fact exists, 
as we shall see in Sec. 5.6. 


4.7.4. The case n = 1. Jordan's theorem. Consider the case n = 1, where 
the basic block B is a closed interval [a, b]. Let $\sigma(\alpha, \beta]$ be a quasi-length 
defined on [a, b], with generating function 

$$F(x) = \sigma(a, x], \qquad F(a) = 0.$$

Then, as we know from Sec. 4.3, $\sigma(\alpha, \beta]$ is of bounded variation in [a, b] 
if and only if F(x) is of bounded variation in [a, b], in the sense that 

$$\sup \sum_{j=1}^{m} |F(x_j) - F(x_{j-1})| < \infty, \tag{40}$$

where the least upper bound is taken with respect to all partitions 

$$a = x_0 < x_1 < \cdots < x_{m-1} < x_m = b. \tag{41}$$

On p. 65 the quantity (40) was called the total variation of F(x) in the 
interval [a, b], denoted by $V_a^b(F)$. If F(x) is of bounded variation in [a, b], 
then obviously F(x) is also of bounded variation in every subinterval of the 
form [a, x], $a < x \le b$, and the quantity (40), where the least upper bound 
is taken with respect to all partitions (41) with $x_m = x$ instead of $x_m = b$, 
becomes a function of x, which we still call the total variation of F(x) and 
denote by $V_a^x(F)$, $\operatorname{Var}_a^x(F)$ or simply V(x). We are now in a position to analyze 
the structure of functions of bounded variation: 


THEOREM 8 (Jordan's theorem). Every generating function F(x) of 
bounded variation in the interval [a, b] can be represented in the form 

$$F(x) = P(x) - Q(x),$$

where P(x) and Q(x) are bounded, nonnegative and nondecreasing 
generating functions. Moreover, the total variation of F(x) is given by 

$$V(x) = P(x) + Q(x).$$

Proof. The corresponding quasi-length defined by 

$$\sigma(\alpha, \beta] = F(\beta) - F(\alpha)$$

can be represented in the form 

$$\sigma(\alpha, \beta] = p(\alpha, \beta] - q(\alpha, \beta],$$

where p and q are (bounded) nonnegative quasi-lengths. The rest of the 
proof follows by setting 

$$P(x) = p(a, x], \qquad Q(x) = q(a, x].$$
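
For a function known only at finitely many points, the decomposition can be computed directly. The following sketch is a discrete analogue under our own assumptions: the quasi-length is restricted to intervals whose end points are the given nodes, so that the least upper bound defining p is attained by collecting the positive increments of F, and that defining q by collecting the negative ones. The name `jordan_decomposition` and the sample values are hypothetical.

```python
def jordan_decomposition(values):
    """Given F(x_0), ..., F(x_m) with F(x_0) = 0, return the positive variation P,
    the negative variation Q and the total variation V at the same nodes,
    all taken relative to the given nodes."""
    P, Q = [0.0], [0.0]
    for previous, current in zip(values, values[1:]):
        increment = current - previous
        P.append(P[-1] + max(increment, 0.0))   # accumulate rises
        Q.append(Q[-1] + max(-increment, 0.0))  # accumulate falls
    V = [p + q for p, q in zip(P, Q)]
    return P, Q, V


# samples of a function that rises, falls twice and rises again
F_values = [0.0, 1.0, 0.0, -1.0, 0.0]
P, Q, V = jordan_decomposition(F_values)
```

Here P and Q are nonnegative and nondecreasing, their difference reproduces the samples of F, and their sum is the total variation relative to the nodes.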


Remark 1. The functions P(x) and Q(x) are called the positive and negative 
variations of F(x). In terms of P(x) and Q(x), we obviously have 

$$I_\sigma f = \int_a^b f(x)\,dF(x) = \int_a^b f(x)\,dP(x) - \int_a^b f(x)\,dQ(x).$$


Remark 2. Analogous results can be deduced for arbitrary n, but they 
are less interesting because of the complexity of the relation between quasi-
volumes and generating functions when n > 1 (see the remark on p. 65). 


Remark 3. Jordan's theorem remains true if the adjective "generating" 
is omitted (twice). In fact, the class of generating functions of bounded 
variation (in the interval [a, b]) is just the subset of the class of functions of 
bounded variation satisfying the extra condition F(a) = 0. Thus, suppose 
F(x) is a function of bounded variation such that $F(a) \ne 0$. Then F(x) − F(a) 
is a generating function of bounded variation, and we need only add F(a) 
to P(x) or Q(x), depending on whether F(a) is positive or negative. 




PROBLEMS 




1. Evaluate the following Stieltjes integrals: 


2 x? for 0< x <1, 
a) I, =| -xdF(x), where F(x) = ae ee ee 
0 for x = —l, 
b) J, =(° x dF(x), where F(x) = 1 for -—-1 <x <2, 
= —-] for 2<x< 3 


2 (The Cantor function). Consider the function C(x), $0 \le x \le 1$, defined as 
follows: At every point x of the Cantor set C (see Prob. 2, p. 21) with a ternary 
expansion $x = 0.\theta_1 \theta_2 \ldots$ (the numbers $\theta_n$ take the values 0 or 2), the function 
C(x) has a binary expansion $0.\theta'_1 \theta'_2 \ldots$, where $\theta'_n = \theta_n / 2$. Then C(x) takes 
equal values at the end points of every interval $[\alpha, \beta]$ adjacent to C (in the 
sense of Prob. 3, p. 22), and the definition of C(x) is completed by setting C(x) 
equal to the corresponding constant in the whole interval $[\alpha, \beta]$. The function 
C(x) is called the Cantor function. Show that C(x) is continuous. 


Hint. C(x) is nondecreasing, and its range is dense in [0, 1]. 
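
The definition translates directly into a procedure for computing C(x) from the ternary digits of x. The sketch below is offered only as an aid to experimentation and proves nothing about continuity; the name `cantor`, the cutoff of 40 digits and the use of floating-point arithmetic are assumptions. As soon as a ternary digit 1 appears, x lies in (or at the left end of) an adjacent interval, on which C is constant.

```python
def cantor(x, digits=40):
    """Approximate the Cantor function C(x) for 0 <= x <= 1 from ternary digits."""
    if x >= 1.0:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(digits):
        x *= 3
        d = int(x)           # next ternary digit: 0, 1 or 2
        x -= d
        if d == 1:
            # C is constant from here on; the binary expansion ends in a 1
            return value + scale
        value += scale * (d // 2)   # ternary digit 2 becomes binary digit 1
        scale /= 2
    return value


# values on a dyadic grid, where the arithmetic above is exact
samples = [cantor(k / 64) for k in range(65)]
```

For instance, x = 1/4 has the ternary expansion 0.0202..., so that C(1/4) has the binary expansion 0.0101..., i.e., equals 1/3.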


3. Show that the Cantor function C(x) defined in the preceding problem cannot 
be represented in the form 

$$C(x) = \int_0^x g(\xi)\,d\xi, \tag{42}$$

where g(x) is a summable function. 


Hint. Assuming that (42) holds, show that g(x) must vanish at almost 
every point of the complement of the Cantor set, thereby establishing a 
contradiction. 


Comment. The Cantor function is the generating function of a quasi-length 
which does not reduce to either of the types considered in Examples 2 and 3, 
p. 64. 


4. Show that the product of two functions $F_1(x)$ and $F_2(x)$ of bounded variation 
is also of bounded variation, where 

$$V_a^b(F_1 F_2) \le \max |F_1(x)|\,V_a^b(F_2) + \max |F_2(x)|\,V_a^b(F_1).$$


5. Let F(x) ≥ α > 0 be a function of bounded variation. Show that 1/F(x) is 
also a function of bounded variation, where 

    $V_a^b\left(\frac{1}{F}\right) \le \frac{1}{\alpha^2}\, V_a^b(F).$ 


6. A curve y = F(x), a ≤ x ≤ b is said to be rectifiable if the length of the 
“inscribed polygonal curve” with consecutive vertices at the points $(x_0, F(x_0))$, 
$(x_1, F(x_1)), \ldots, (x_{n-1}, F(x_{n-1}))$, $(x_n, F(x_n))$, where 

    $a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b,$ 

is bounded by a fixed constant independent of n and the choice of the 
intermediate points $x_1, \ldots, x_{n-1}$. Prove that the curve y = F(x) is rectifiable if and 
only if the function F(x) is of bounded variation. 


Hint. Use the inequality 

    $|\Delta y_j| \le \sqrt{(\Delta x_j)^2 + (\Delta y_j)^2} \le |\Delta x_j| + |\Delta y_j|.$ 


7. Show that the continuous function 

    $x^\alpha \sin\frac{1}{x^\beta} \qquad (0 < x \le 1;\ \alpha, \beta > 0)$ 

is of bounded variation if α > β, but not if α ≤ β. 


8. Let V be the space of all functions F(x) of bounded variation in the interval 
[a, b], where functions differing by a constant are regarded as equivalent. Show 
that V is a complete normed linear space, when equipped with the norm 

    $\|F\| = V_a^b(F).$ 


5 


THE LEBESGUE-STIELTJES INTEGRAL 


5.1. Definition of the Lebesgue-Stieltjes Integral 


The natural next step is to construct a Lebesgue-Stieltjes integral which 
generalizes the Riemann-Stieltjes integral in the same way as the Lebesgue 
integral generalizes the ordinary Riemann integral. As shown in Chap. 3, the 
Lebesgue integral can be constructed starting from either step functions 
or continuous functions as the elementary functions, depending on our 
preference. However, these two constructions are no longer equivalent in 
the case of the Lebesgue-Stieltjes integral. More exactly, we can always 
construct the Lebesgue-Stieltjes integral starting from continuous functions 
as the elementary functions, with the Riemann-Stieltjes integral as the ele- 
mentary integral, but if step functions are chosen as the elementary functions, 
the construction is possible only when extra conditions are imposed on the 
original quasi-volume (see Sec. 5.8). For this reason, we prefer to begin 
with the first approach, using continuous functions as the elementary func- 
tions and the Riemann-Stieltjes integral as the elementary integral. 

Let σ be a quasi-volume defined on a dense set of blocks Q in the basic 
block $\mathbf{B}$. For the time being, we assume that σ is nonnegative, i.e., that 
σ(B) ≥ 0 for every block B ∈ Q. Let H be the set of all functions continuous 
in $\mathbf{B}$. Then H obviously satisfies the conditions on p. 23 for a family 
of elementary functions, with $\mathbf{B}$ playing the role of the set X. Moreover, if 
we define the elementary integral of any function f ∈ H as the Riemann- 
Stieltjes integral 

    $I_\sigma f = \int_{\mathbf{B}} f(x)\,\sigma(dx),$ 

then it is easy to see that $I_\sigma f$ satisfies the axioms on p. 24 for an elementary 
integral. In fact, Axioms 1 and 2 are obvious, while Axiom 3 is an immediate 
consequence of the estimate 

    $|I_\sigma f_n| = \left| \int_{\mathbf{B}} f_n(x)\,\sigma(dx) \right| \le V_{\mathbf{B}}(\sigma) \max |f_n(x)|$ 

[cf. formula (12), p. 68] and Dini’s lemma, just as in Sec. 3.4. 

We are now in a position to apply the general scheme of Chap. 2, obtaining 
first a class $L_\sigma^+(\mathbf{B})$ of functions f which are (almost-everywhere) limits of 
nondecreasing sequences $h_n \in H$ with bounded Riemann-Stieltjes integrals,¹ 
and then a class $L_\sigma(\mathbf{B})$ of functions φ which are differences of functions in 
$L_\sigma^+(\mathbf{B})$. The functions in $L_\sigma$ are said to be σ-summable or Lebesgue-Stieltjes 
integrable (with respect to the quasi-volume σ), and the corresponding 
Lebesgue-Stieltjes integral is denoted by $I_\sigma \varphi$ or 

    $\int_{\mathbf{B}} \varphi(x)\,\sigma(dx).$ 

Moreover, $L_\sigma$ is a complete normed linear space, when equipped with the 
norm 

    $\|\varphi\|_\sigma = I_\sigma(|\varphi|).$ 

In the case n = 1, the Lebesgue-Stieltjes integral over the closed interval 
[a, b] is denoted by $I_\sigma \varphi$, 

    $\int_a^b \varphi(x)\,\sigma(dx),$ 

or 

    $\int_a^b \varphi(x)\,dF(x),$ 

where F(x) is the generating function of the quasi-length σ(α, β]. 


5.2. Examples 


Naturally, the character of $L_\sigma$ depends on the quasi-volume σ, as we now 
illustrate, using the same examples as in Sec. 4.2. 


Example 1. Let 

    $\sigma(B) = s(B),$ 

where s(B) is the volume of the block B in the ordinary sense (and $\mathbf{B}$ is 
finite). Then obviously $L_\sigma$ is just the class of Lebesgue-integrable functions. 


1 Here convergence almost everywhere is defined in the obvious way, i.e., a set Z ⊂ X 
is called a set of σ-measure zero (relative to the elementary integral $I_\sigma$) if given any ε > 0, 
there exists a nondecreasing sequence of nonnegative functions $h_n(x) \in H$ such that $I_\sigma h_n < \varepsilon$ 
and $\sup_n h_n(x) \ge 1$ on Z, and a sequence is said to converge almost everywhere if it converges 
everywhere except on a set of σ-measure zero. 




Example 2. Let 

    $\sigma(B) = \int_B g(x)\,dx,$ 

where g(x) ≥ 0 is a function summable over $\mathbf{B}$, and let h(x) be a function 
continuous in $\mathbf{B}$. Then every Riemann-Stieltjes sum of h(x), with respect 
to the quasi-volume σ, is of the form 

    $\sum_{j=1}^m h(\xi_j)\,\sigma(B_j) = \sum_{j=1}^m h(\xi_j) \int_{B_j} g(x)\,dx = \sum_{j=1}^m \int_{B_j} h_\Pi(x)\,g(x)\,dx = \int_{\mathbf{B}} h_\Pi(x)\,g(x)\,dx,$ 

where $h_\Pi(x)$ is the step function equal to $h(\xi_j)$ in the block $B_j$ (j = 1, ..., m). 
Since $h_\Pi(x)$ converges uniformly to the function h(x) as the partition is 
refined indefinitely, i.e., as d(Π) → 0, it follows from Lebesgue’s theorem 
(see Sec. 2.7) that 

    $I_\sigma h = \int_{\mathbf{B}} h(x)\,\sigma(dx) = \lim_{d(\Pi)\to 0} \sum_{j=1}^m h(\xi_j)\,\sigma(B_j) = \int_{\mathbf{B}} h(x)\,g(x)\,dx = I(hg),$ 

where I is the ordinary Lebesgue integral. 
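
The identity $I_\sigma h = I(hg)$ for continuous h can be checked numerically in a simple one-dimensional case. The sketch below is only an illustration under assumed data: the interval [0, 1], the density g(x) = 2x (so that σ(α, β] = β² − α²), the integrand h(x) = cos x, and the helper names `sigma` and `stieltjes_sum` are choices made here, not part of the text.

```python
import math


def sigma(alpha, beta):
    """Quasi-length of (alpha, beta] for the assumed density g(x) = 2x,
    i.e. the ordinary integral of g over the block."""
    return beta**2 - alpha**2


def stieltjes_sum(h, m):
    """Riemann-Stieltjes sum of h over [0, 1] for a partition into m
    equal blocks, using the right end point of each block as xi_j."""
    total = 0.0
    for j in range(m):
        alpha, beta = j / m, (j + 1) / m
        total += h(beta) * sigma(alpha, beta)
    return total


# Ordinary integral of h(x) g(x) = 2 x cos x over [0, 1], in closed form.
exact = 2 * (math.sin(1) + math.cos(1) - 1)
approx = stieltjes_sum(math.cos, 10_000)
```

As the partition is refined, `approx` approaches `exact`; with the data above the difference is bounded by the oscillation of h on a single block, here at most 1/m.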

Now let $h_p$ be a nondecreasing sequence of elementary functions which 
converges (everywhere) to a function f, and suppose the integrals $I_\sigma h_p$, and 
hence the Lebesgue integrals $I(h_p g)$, form a bounded sequence. Then the 
limit fg of the sequence $h_p g$ is summable in the ordinary sense, and 

    $I_\sigma f = \lim_{p\to\infty} I_\sigma h_p = \lim_{p\to\infty} I(h_p g) = I(fg).$ 

Such a function belongs to the class $L_\sigma^+$ by definition. Moreover, any function 
in $L_\sigma^+$ differs from a function of this type only by a function $f_0$ which vanishes 
everywhere except on a set of σ-measure zero. It follows that $I(f_0 g) = 0$. In 
fact, let $Z = \{x: f_0(x) \ne 0\}$. Since Z is a set of σ-measure zero, given any 
positive integer m, there exists a nondecreasing sequence of nonnegative 
continuous functions $h_p^{(m)}(x)$ such that $I_\sigma h_p^{(m)} < 1/m$ and $\sup_p h_p^{(m)}(x) \ge 1$ 
on Z. Moreover, it can be assumed that 

    $h_p^{(m+1)}(x) \le h_p^{(m)}(x)$ 

for arbitrary p and m. For fixed m, $h_p^{(m)}$ is a nondecreasing sequence converging 
to some limit $h^{(m)}(x)$. Taking the limit as m → ∞ of the nonincreasing 
sequence $h^{(m)}(x)$, we obtain a function h(x) which is ≥ 1 on Z. According 
to Levi’s theorem, 

    $I(h^{(m)} g) = \lim_{p\to\infty} I(h_p^{(m)} g) = \lim_{p\to\infty} I_\sigma(h_p^{(m)}),$ 

and hence for any m we have 

    $I(h^{(m)} g) \le \frac{1}{m}.$ 

Therefore, by Levi’s theorem again, 

    $I(hg) = \lim_{m\to\infty} I(h^{(m)} g) = 0.$ 

Therefore the function hg vanishes almost everywhere (relative to the 
ordinary Lebesgue integral I). But then $f_0 g$ also vanishes almost everywhere, 
since if $f_0(x_0) g(x_0) \ne 0$ for some $x_0$, then $f_0(x_0) \ne 0$ and $g(x_0) \ne 0$, i.e., 
$h(x_0) \ge 1$, $g(x_0) \ne 0$ and $h(x_0) g(x_0) \ne 0$. Therefore $I(f_0 g) = 0$, as asserted. 

Thus we have finally shown that if $f \in L_\sigma^+$, then the product fg is summable 
(in the ordinary sense) and 

    $I_\sigma f = I(fg).$ 

Taking differences, we see that if φ is σ-summable ($\varphi \in L_\sigma$), then the product 
φg is summable, and moreover 

    $I_\sigma \varphi = I(\varphi g).$    (1) 

The converse is also true, i.e., if the product φg is summable, then φ is 
σ-summable. However, the proof of this assertion requires a deeper knowledge 
of measure theory, and hence will be postponed until Sec. 7.6. 


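Example 3. Given a sequence of points $c_1, \ldots, c_m, \ldots$ in the basic block 
$\mathbf{B}$ and a sequence of positive real numbers $g_1, \ldots, g_m, \ldots$ such that 

    $\sum_{m=1}^\infty g_m < \infty,$ 

let σ(B) equal the sum of all the $g_m$ such that the corresponding points $c_m$ 
belong to the block B. Moreover, let the function h(x) be continuous in $\mathbf{B}$, 
and let Π = {$B_j$} be a partition of $\mathbf{B}$. Then 

    $I_\sigma h = \int_{\mathbf{B}} h(x)\,\sigma(dx) = \lim_{d(\Pi)\to 0} \sum_{j=1}^m h(\xi_j)\,\sigma(B_j) = \lim_{d(\Pi)\to 0} \sum_{j=1}^m h(\xi_j) \sum_{c_k \in B_j} g_k$ 

    $\qquad = \sum_{j=1}^m \sum_{c_k \in B_j} h(c_k)\,g_k + \lim_{d(\Pi)\to 0} \sum_{j=1}^m \sum_{c_k \in B_j} [h(\xi_j) - h(c_k)]\,g_k$ 

    $\qquad = \sum_{k=1}^\infty h(c_k)\,g_k,$    (2) 

since h(x) is uniformly continuous in $\mathbf{B}$. 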
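
Formula (2) can be illustrated numerically. The following sketch works under assumptions made only for the illustration: the basic block is [0, 1], the points are c_k = 1/k with weights g_k = 2^(-k), the infinite sequence is truncated after K = 30 terms, and the names `sigma`, `stieltjes_sum` and `series` are hypothetical helpers.

```python
K = 30                                            # assumed truncation of the sequence
points = [1.0 / k for k in range(1, K + 1)]       # c_k
weights = [2.0 ** -k for k in range(1, K + 1)]    # g_k > 0, with a convergent sum


def sigma(alpha, beta):
    """Quasi-length of (alpha, beta]: sum of the g_k with c_k in the block."""
    return sum(g for c, g in zip(points, weights) if alpha < c <= beta)


def stieltjes_sum(h, m):
    """Riemann-Stieltjes sum over a partition of [0, 1] into m equal
    blocks, with xi_j taken as the midpoint of the block B_j."""
    total = 0.0
    for j in range(m):
        alpha, beta = j / m, (j + 1) / m
        total += h(0.5 * (alpha + beta)) * sigma(alpha, beta)
    return total


def series(h):
    """Right-hand side of (2): the sum of h(c_k) g_k."""
    return sum(h(c) * g for c, g in zip(points, weights))
```

For a continuous function such as h(x) = x², the value `stieltjes_sum(h, m)` differs from `series(h)` by at most the oscillation of h over one block times the total weight, so the two agree in the limit of fine partitions.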

Now let $h_p$ be a nondecreasing sequence of elementary (i.e., continuous) 
functions which converges to a function f at every point $c_k$, and suppose the 
integrals $I_\sigma h_p$ form a bounded sequence. Then, as we know, the limit function 
f belongs to the class $L_\sigma^+$. It can be assumed in this argument that the functions 
$h_p$ are nonnegative, since otherwise we need only replace $h_p$ by $h_p - h_1$ and f 
by $f - h_1$. Using (2), we have 

    $I_\sigma h_p = \sum_{k=1}^\infty h_p(c_k)\,g_k \le C,$ 

and hence 

    $h_p(c_k)\,g_k \le C$ 

for all p. It follows that $h_p(c_k) \nearrow f(c_k)$ for every fixed k. Moreover, for every 
N, 

    $C \ge \sum_{k=1}^N h_p(c_k)\,g_k \nearrow \sum_{k=1}^N f(c_k)\,g_k,$ 

which implies 

    $\sum_{k=1}^\infty f(c_k)\,g_k = \lim_{N\to\infty} \sum_{k=1}^N f(c_k)\,g_k \le C.$ 

Finally, given any ε > 0, we have 

    $\sum_{k=N}^\infty f(c_k)\,g_k < \varepsilon$ 

for sufficiently large N. Therefore 

    $\sum_{k=1}^\infty f(c_k)\,g_k - \sum_{k=1}^\infty h_p(c_k)\,g_k \le \left| \sum_{k=1}^{N-1} f(c_k)\,g_k - \sum_{k=1}^{N-1} h_p(c_k)\,g_k \right| + \sum_{k=N}^\infty f(c_k)\,g_k < 2\varepsilon$ 

for sufficiently large p, and hence 

    $I_\sigma f = \lim_{p\to\infty} I_\sigma h_p = \sum_{k=1}^\infty f(c_k)\,g_k.$ 

Thus we have shown that if $f \in L_\sigma^+$, then the series 

    $\sum_{k=1}^\infty f(c_k)\,g_k$ 

converges and equals the integral $I_\sigma f$. It should be noted that the values of 
f(x) at points of the block $\mathbf{B}$ other than $c_k$ play no role at all, and in fact, 
f(x) need only be defined at the points $c_k$. Taking differences, we see that if 
φ is defined at every point $c_k$ and if $\varphi \in L_\sigma$, then the series 

    $\sum_{k=1}^\infty \varphi(c_k)\,g_k$ 

converges and equals the integral $I_\sigma \varphi$, and moreover 

    $\sum_{k=1}^\infty |\varphi(c_k)|\,g_k < \infty.$    (3) 


Conversely, if (3) holds, then φ belongs to $L_\sigma$. In fact, suppose φ is nonzero 
at only one of the points $c_k$, say $c_1$. Then, if $\varphi(c_1)$ is positive, φ is the limit of a 
nonincreasing sequence of continuous functions $h_p(x)$ equal to $\varphi(c_1)$ at the 
point $c_1$.² Therefore, in this case, $\varphi \in L_\sigma$ (in fact, $\varphi \in L_\sigma^+$), and $I_\sigma \varphi$ is just 
$\varphi(c_1) g_1$. If $\varphi(c_1)$ is negative, the same argument can be used to show that −φ 
belongs to $L_\sigma^+$, so that φ again belongs to $L_\sigma$, with the same integral as before. 
In the general case, φ is a sum of functions each “concentrated” at one of 
the points $c_k$, where the sum of the corresponding integrals is absolutely 
convergent, by hypothesis. Therefore φ belongs to $L_\sigma$, by an obvious version 
of Levi’s theorem. Our description of $L_\sigma$ is now complete. 


Remark. In each of the above examples, the Lebesgue-Stieltjes integral 
of φ turns out to be a numerical series or the ordinary Lebesgue integral of 
the product of φ with some function. In the general case, the Lebesgue- 
Stieltjes integral has a more complicated structure (cf. Prob. 3, p. 86). 


5.3. The Lebesgue-Stieltjes Integral with Respect 
to a Signed Quasi-Volume 


As we know from Sec. 4.7, a signed quasi-volume σ (of bounded variation) 
can be represented as the difference between two nonnegative quasi-volumes p 
and q, called the positive and negative variations of σ. Let v be the quasi- 
volume p + q, i.e., the total variation of σ. Then, using the nonnegative 
quasi-volumes v, p and q, we can construct corresponding spaces $L_v$, $L_p$ and $L_q$ 
of summable functions. Every function $\varphi \in L_v$ also belongs to the spaces $L_p$ 
and $L_q$, as we shall prove in a moment. Thus, if $\varphi \in L_v$, the integrals $I_v \varphi$, 
$I_p \varphi$ and $I_q \varphi$ all exist, where, as is easily verified, 

    $I_v \varphi = I_p \varphi + I_q \varphi.$ 

This suggests the following 


DEFINITION. If $\varphi \in L_v$, the Lebesgue-Stieltjes integral of φ is given by 

    $I_\sigma \varphi = I_p \varphi - I_q \varphi.$    (4) 


For continuous φ, the expression (4) coincides with the Riemann- 
Stieltjes integral of φ with respect to σ. Moreover, as shown in Sec. 
4.7.2, the canonical representation σ = p − q is the “most economical,” 
in the sense that $p_1 \ge p$, $q_1 \ge q$, $v_1 = p_1 + q_1 \ge v$ for any other 
representation. Therefore the space $L_v$ (corresponding to the canonical 
representation) is the largest possible ($L_{v_1} \subset L_v$). 


2 In the one-dimensional case, choose the functions 

    $h_p(x) = \begin{cases} \varphi(c_1)\{1 - p|x - c_1|\} & \text{for } |x - c_1| \le 1/p,\ x \in \mathbf{B}, \\ 0 & \text{otherwise.} \end{cases}$ 

In the n-dimensional case, represent $h_p(x)$ as a product of n such functions, each depending 
on one coordinate. 

We now establish the missing step, as promised. Let p and v be any 
two nonnegative quasi-volumes, defined on some dense set of blocks Q 
in $\mathbf{B}$, such that p(B) ≤ v(B) for every B ∈ Q. Then $L_v \subset L_p$, and moreover 

    $I_p(|\varphi|) \le I_v(|\varphi|)$    (5) 

for every $\varphi \in L_v$. For continuous φ, this is obvious, since then 

    $I_p(|\varphi|) = \int_{\mathbf{B}} |\varphi(x)|\,p(dx) \le \int_{\mathbf{B}} |\varphi(x)|\,v(dx) = I_v(|\varphi|).$ 

More generally, suppose φ is the limit (everywhere) of a nondecreasing 
sequence of continuous functions $h_m$, with bounded integrals $I_v h_m$. Then 
the integrals $I_p h_m$ are also bounded, so that the function φ belongs to $L_p^+$. 
However, this is still not sufficient for our purposes, since the general 
function in $L_v^+$ differs from a function like φ by a function $\varphi_0$ vanishing almost 
everywhere relative to $I_v$. But $\varphi_0$ also vanishes almost everywhere relative 
to $I_p$. In fact, given any ε > 0, there exists a nondecreasing sequence of 
nonnegative continuous functions $h_m(x)$ such that $I_v h_m < \varepsilon$ and 
$\sup_m h_m(x) \ge 1$ on the set $Z = \{x: \varphi_0(x) \ne 0\}$. Therefore, since $I_p h_m \le 
I_v h_m$, the set Z is also of measure zero relative to $I_p$. It follows that $\varphi_0$ 
belongs to $L_p$, and hence so does the general function in $L_v^+$. Moreover, 
any function $\varphi \in L_v$ belongs to $L_p$, being the difference between two 
functions in $L_v^+$, i.e., $L_v \subset L_p$, as required. Finally, to see that the 
inequality (5), valid for continuous φ, continues to hold for every $\varphi \in L_v$, 
we need only take the limit with respect to the v-norm, noting that the 
set of elementary functions is dense in the space of summable functions 
(cf. Sec. 2.9). 
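
For a signed quasi-volume made up of finitely many point masses, formula (4) and inequality (5) reduce to statements about finite sums, which makes them easy to check. The sketch below assumes a setting in the spirit of Example 3, Sec. 5.2, but with signed weights at distinct points; in that case the positive and negative variations are taken to be the sums of the positive weights and of the absolute values of the negative weights. The data and the helper name `integral` are illustrative assumptions.

```python
points = [0.1, 0.4, 0.7, 0.9]          # assumed distinct points c_k
weights = [0.5, -0.2, 0.3, -0.1]       # assumed signed weights g_k


def integral(phi, ws):
    """Integral of phi with respect to point masses ws at the points c_k."""
    return sum(phi(c) * w for c, w in zip(points, ws))


p = [max(w, 0.0) for w in weights]     # positive variation
q = [max(-w, 0.0) for w in weights]    # negative variation
v = [a + b for a, b in zip(p, q)]      # total variation v = p + q


def phi(x):
    return x - 0.5


def abs_phi(x):
    return abs(phi(x))


# Formula (4): the integral with respect to sigma = p - q.
I_sigma = integral(phi, p) - integral(phi, q)
```

With these numbers `I_sigma` agrees with the direct signed sum `integral(phi, weights)`, while `integral(abs_phi, p)` does not exceed `integral(abs_phi, v)`, as inequality (5) requires.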


5.4. The General Continuous Linear Functional 
on the Space C(B) 


Given a (signed) quasi-volume σ of bounded variation, we can form 
the Riemann-Stieltjes integral 

    $I_\sigma f = \int_{\mathbf{B}} f(x)\,\sigma(dx)$ 

of any function f(x) continuous in the basic block $\mathbf{B}$. Let $C(\mathbf{B})$ be the normed 
linear space of all functions continuous in $\mathbf{B}$, equipped with the norm 

    $\|f\| = \max_{x \in \mathbf{B}} |f(x)|.$ 

Then the integral $I_\sigma f$ defines a continuous linear functional on $C(\mathbf{B})$, since 
it satisfies the following two conditions: 

1) If $f_1$, $f_2$ are any two functions in $C(\mathbf{B})$ and $\alpha_1$, $\alpha_2$ are any two real 
numbers, then 

    $I_\sigma(\alpha_1 f_1 + \alpha_2 f_2) = \alpha_1 I_\sigma f_1 + \alpha_2 I_\sigma f_2.$ 

2) If $f_m \in C(\mathbf{B})$ is a sequence such that $\|f_m\| \to 0$ as m → ∞, then $I_\sigma f_m \to 0$, 
as follows at once from the estimate 

    $|I_\sigma f_m| = \left| \int_{\mathbf{B}} f_m(x)\,\sigma(dx) \right| \le \|f_m\|\, V_{\mathbf{B}}(\sigma)$ 

(cf. p. 89). 


Thus every Riemann-Stieltjes integral gives rise to a continuous linear 
functional on C(B). We now prove the converse: 


THEOREM 1. Given a continuous linear functional If defined on the 
space $C(\mathbf{B})$, there exists a quasi-volume σ = σ(I) of bounded variation 
such that 

    $If = \int_{\mathbf{B}} f(x)\,\sigma(dx)$    (6) 

for every $f \in C(\mathbf{B})$. 


Proof. First suppose I is nonnegative, so that If ≥ 0 if f(x) ≥ 0. 
Then, choosing $C(\mathbf{B})$ as the space of elementary functions and the 
functional I as the elementary integral, we can construct a space $L_I$ 
of I-summable functions. The only nontrivial part of this assertion, given 
the theory of Chap. 2, is to verify that I satisfies Axiom 3, p. 24. But, 
according to Dini’s lemma (p. 54), $f_m \searrow 0$ implies $f_m \to 0$ uniformly, 
i.e., $\|f_m\| \to 0$, and hence $If_m \to 0$. In particular, $L_I$ contains the 
characteristic function 

    $\chi_B(x) = \begin{cases} 1 & \text{for } x \in B, \\ 0 & \text{for } x \notin B \end{cases}$ 

of every block $B \subset \mathbf{B}$. In fact, $\chi_B(x)$ can be represented (in various ways) 
as the limit of an everywhere convergent sequence of continuous 
functions $f_m(x)$, where the functions $f_m(x)$ can be chosen to be nonnegative 
and bounded by 1 (the construction resembles that given in footnote 2, 
p. 93). Therefore 

    $If_m \le I(1),$ 

and hence, by Lebesgue’s theorem (see Sec. 2.7), $\chi_B \in L_I$ and 

    $I\chi_B = \lim_{m\to\infty} If_m.$ 

Defining 

    $\sigma(B) = I\chi_B,$    (7) 

we see that σ(B) is bounded, since $\sigma(B) \le \sigma(\mathbf{B}) = I(1)$, and obviously 
represents a nonnegative quasi-volume defined on every block $B \subset \mathbf{B}$. 
Moreover, the quasi-volume (7) satisfies the relation (6). To see this, 
note that the Riemann-Stieltjes integral in the right-hand side of (6) is 
the limit of the sum 

    $\sum_{j=1}^m f(\xi_j)\,\sigma(B_j) = \sum_{j=1}^m f(\xi_j)\, I\chi_{B_j} = I\left[ \sum_{j=1}^m f(\xi_j)\,\chi_{B_j}(x) \right] = I h_\Pi(x)$ 

as the partition Π = {$B_j$} is refined indefinitely, where $h_\Pi(x)$ is the step 
function equal to $f(\xi_j)$ in the block $B_j$ (j = 1, ..., m). As d(Π) → 0, 
$h_\Pi(x)$ converges uniformly to f(x), and hence, by Lebesgue’s theorem 
again, 

    $\lim_{d(\Pi)\to 0} I h_\Pi(x) = If,$ 

as asserted. 


Finally, if the functional I takes values of either sign, then, according 
to Riesz’s representation theorem (p. 44), we can represent I in the form 

    $I = J - N,$ 

where the linear functionals J and N are nonnegative and continuous 
in the sense that $h_m \searrow 0$ implies $Jh_m \to 0$, $Nh_m \to 0$. Let p and q be the 
nonnegative quasi-volumes corresponding to J and N, in accordance 
with the above construction. Then, given any function $f \in C(\mathbf{B})$, 

    $If = Jf - Nf = \int_{\mathbf{B}} f(x)\,p(dx) - \int_{\mathbf{B}} f(x)\,q(dx) = \int_{\mathbf{B}} f(x)\,\sigma(dx),$ 

and If can once again be written in the form (6), where σ = p − q is a 
quasi-volume of bounded variation. This completes the proof. 


Remark. Thus, without knowing it at the time, in carrying out the 
constructions of Secs. 5.1 and 5.3, we were actually using the most general 
continuous linear functional as the elementary integral! 


5.5. Relation between the Quasi-Volumes σ and $\tilde\sigma$ 


Suppose the functional If figuring in Theorem 1 is itself a Riemann- 
Stieltjes integral, with respect to a quasi-volume σ defined on a dense set 
of blocks Q in the basic block 

    $\mathbf{B} = \{x: a_1 \le x_1 \le b_1, \ldots, a_n \le x_n \le b_n\}.$ 

Then it might be expected at first that 

    $\tilde\sigma(B) = I_\sigma \chi_B$ 

would coincide with σ(B) for every B ∈ Q. Nevertheless, in general this is 
not the case (see Prob. 7, p. 109), and it can only be said that 

    $\tilde{\tilde\sigma} = \tilde\sigma,$    (8) 

i.e., that repetition of the process leading from σ to $\tilde\sigma$ gives nothing new. 
In fact, if the sequence $f_m$ has the same meaning as on p. 95, then obviously 

    $\tilde{\tilde\sigma}(B) = \lim_{m\to\infty} I_{\tilde\sigma} f_m = \lim_{m\to\infty} I_\sigma f_m = \tilde\sigma(B),$ 

as asserted. Note, however, that $\tilde\sigma(\mathbf{B})$ always coincides with $\sigma(\mathbf{B})$, since 
$f_m(x) \equiv 1$ [i.e., $f_m(x) = 1$ for all x in the basic block $\mathbf{B}$] is a sequence of 
continuous functions converging (trivially) to $\chi_{\mathbf{B}}$, and obviously in this case 

    $I_\sigma f_m = \int_{\mathbf{B}} \sigma(dx) = \sigma(\mathbf{B}) \qquad (m = 1, 2, \ldots).$ 


To find a direct connection between σ and $\tilde\sigma$, we must first introduce 
some new concepts. The block 

    $B_1 = \{x: \alpha_1^{(1)} < x_1 \le \beta_1^{(1)}, \ldots, \alpha_n^{(1)} < x_n \le \beta_n^{(1)}\}$ 

is said to be strictly included in the block 

    $B_2 = \{x: \alpha_1^{(2)} < x_1 \le \beta_1^{(2)}, \ldots, \alpha_n^{(2)} < x_n \le \beta_n^{(2)}\}$ 

(symbolically $B_1 \Subset B_2$ or $B_2 \Supset B_1$) if 

1) $B_1 \subset B_2$ in the usual sense, i.e., 

    $\alpha_k^{(2)} \le \alpha_k^{(1)} < \beta_k^{(1)} \le \beta_k^{(2)} \qquad (k = 1, \ldots, n);$ 

2) For all k, 

    $\alpha_k^{(2)} < \alpha_k^{(1)}$ if $\alpha_k^{(2)} \ne a_k$, 
    $\beta_k^{(1)} < \beta_k^{(2)}$ if $\beta_k^{(2)} \ne b_k$, 

but if $\alpha_k^{(2)} = a_k$, the relation $\alpha_k^{(1)} = \alpha_k^{(2)} = a_k$ is permitted, and 
similarly for the “upper coordinates” $\beta_k^{(1)}$ and $\beta_k^{(2)}$. 

If $B_1 \Subset B_2$, there exists a continuous function f(x) in $\mathbf{B}$, taking values 
between 0 and 1, such that 

    $f(x) = \begin{cases} 1 & \text{for } x \in B_1, \\ 0 & \text{for } x \notin B_2. \end{cases}$    (9) 

For the case n = 1, this is obviously true for a suitable “trapezoidal function” 
f(x), and for arbitrary n, we need only form a product of n such functions, 
each depending on one coordinate (cf. footnote 2, p. 93). If $B_1 \Subset B_2$, then 
it follows at once from the definition of the Riemann-Stieltjes integral that 

    $\sigma(B_1) \le \int_{\mathbf{B}} f(x)\,\sigma(dx) \le \sigma(B_2)$ 

for the function (9) and any nonnegative quasi-volume σ. 
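We still need another definition. The sequence of blocks 

    $B_m = \{x: \alpha_1^{(m)} < x_1 \le \beta_1^{(m)}, \ldots, \alpha_n^{(m)} < x_n \le \beta_n^{(m)}\}$ 

is said to converge downward to the block 

    $B = \{x: \alpha_1 < x_1 \le \beta_1, \ldots, \alpha_n < x_n \le \beta_n\}$ 

(symbolically $B_m \searrow B$), if for all k, $\alpha_k^{(m)} \searrow \alpha_k$, $\beta_k^{(m)} \searrow \beta_k$, where 

    $\alpha_k < \alpha_k^{(m)} < \beta_k < \beta_k^{(m)}$ if $\alpha_k \ne a_k$, $\beta_k \ne b_k$, 
    $\alpha_k = \alpha_k^{(m)} < \beta_k < \beta_k^{(m)}$ if $\alpha_k = a_k$, 
    $\alpha_k < \alpha_k^{(m)} < \beta_k = \beta_k^{(m)}$ if $\beta_k = b_k$. 

We are now in a position to find the relation between σ and $\tilde\sigma$. Given 
a sequence of blocks $B_m \searrow B$, we construct two other sequences $B_m' \searrow B$ 
and $B_m'' \searrow B$ such that $B_m' \Supset B_m \Supset B_m''$. Let $f_m(x)$ and $g_m(x)$ be two sequences 
of continuous functions, taking values between 0 and 1, such that 

    $f_m(x) = \begin{cases} 1 & \text{for } x \in B_m'', \\ 0 & \text{for } x \notin B_m, \end{cases} \qquad g_m(x) = \begin{cases} 1 & \text{for } x \in B_m, \\ 0 & \text{for } x \notin B_m'. \end{cases}$ 

Then 

    $\int_{\mathbf{B}} f_m(x)\,\sigma(dx) \le \sigma(B_m) \le \int_{\mathbf{B}} g_m(x)\,\sigma(dx)$    (10) 

for any nonnegative quasi-volume σ. But as m → ∞, both $f_m(x)$ and $g_m(x)$ 
converge to the characteristic function of the block B, and hence 

    $\tilde\sigma(B) = \lim_{m\to\infty} I_\sigma f_m(x) = \lim_{m\to\infty} I_\sigma g_m(x).$    (11) 

Comparing (10) and (11), we see at once that the limit 

    $\tilde\sigma(B) = \lim_{m\to\infty} \sigma(B_m)$    (12) 

is uniquely defined for any nonnegative quasi-volume σ and any sequence 
of blocks $B_m \in Q(\sigma)$ converging downward to the (arbitrary) block $B \subset \mathbf{B}$. 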

Formula (12) gives the desired relation between the quasi-volume $\tilde\sigma$ 
constructed via Theorem 1 and the original quasi-volume σ, for the case 
where σ is nonnegative. In the general case, where σ is a signed quasi- 
volume (of bounded variation), we represent σ as a difference between two 
nonnegative quasi-volumes p and q, defined on the same dense set of blocks 
Q(σ), as in Sec. 4.7.1. Then, according to Sec. 5.3, the integral $I_\sigma$ is defined 
as the difference between the nonnegative integrals $I_p$ and $I_q$. It follows that 

    $\tilde\sigma(B) = I_\sigma \chi_B = \lim_{m\to\infty} I_\sigma f_m = \lim_{m\to\infty} I_p f_m - \lim_{m\to\infty} I_q f_m$ 

    $\qquad = \tilde p(B) - \tilde q(B) = \lim_{m\to\infty} p(B_m) - \lim_{m\to\infty} q(B_m) = \lim_{m\to\infty} \sigma(B_m),$ 

i.e., formula (12) continues to hold. 


Remark 1. It should be emphasized that since σ is of bounded variation, 
so is $\tilde\sigma$, and 

    $\int_{\mathbf{B}} f(x)\,\tilde\sigma(dx) = \int_{\mathbf{B}} f(x)\,\sigma(dx)$ 

for every function f(x) continuous in $\mathbf{B}$ (by the very construction of $\tilde\sigma$). 


Remark 2. In the case n = 1, the above results take the following form: 
Given any generating function F(x) of bounded variation defined on a dense set 
E ⊂ [a, b] (containing a and b), the right-hand limit 

    $\tilde F(x) = \lim_{h_k \searrow 0,\ x + h_k \in E} F(x + h_k)$ 

exists for every point x ∈ (a, b]. This is hardly surprising, since, according 
to Jordan’s theorem (p. 85), F(x) can be represented as the difference 
between two nondecreasing functions, for which the existence of $\tilde F(x)$ is 
obvious, even at the point x = a. Moreover, the function $\tilde F(x)$ is itself of 
bounded variation, and the relation 

    $\int_a^b f(x)\,d\tilde F(x) = \int_a^b f(x)\,dF(x)$ 

holds for every function f(x) continuous in [a, b]. 
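
The last relation of Remark 2 can be illustrated with a single jump. In the sketch below, which rests entirely on assumed data, F is a generating function on [0, 1] that jumps at x = 1/2 and takes an intermediate value there, while `F_tilde` is its right-hand limit; the helper `stieltjes_sum` and the test integrand are hypothetical choices.

```python
import math


def F(x):
    """Assumed generating function: jump at 1/2, intermediate value at the jump."""
    if x < 0.5:
        return 0.0
    if x == 0.5:
        return 0.3
    return 1.0


def F_tilde(x):
    """Right-hand limit of F: continuous from the right at the jump."""
    return 0.0 if x < 0.5 else 1.0


def stieltjes_sum(f, G, m):
    """Riemann-Stieltjes sum of f with respect to G over [0, 1], using m
    equal blocks and the midpoint of each block as the point xi_j."""
    total = 0.0
    for j in range(m):
        alpha, beta = j / m, (j + 1) / m
        total += f(0.5 * (alpha + beta)) * (G(beta) - G(alpha))
    return total


s_F = stieltjes_sum(math.exp, F, 1000)
s_F_tilde = stieltjes_sum(math.exp, F_tilde, 1000)
```

Both sums approach f(1/2) as the partition is refined, even though F and `F_tilde` differ at the jump point: for a continuous integrand the value assigned to F at the jump does not affect the integral.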


5.6. Continuous Quasi-Volumes 


A quasi-volume σ (in general, signed), defined on all blocks $B \subset \mathbf{B}$, is 
said to be continuous (more exactly, upper continuous) if it satisfies the 
condition 

    $\sigma(B) = \lim_{m\to\infty} \sigma(B_m)$    (13) 

for every block $B \subset \mathbf{B}$ and every sequence $B_m \searrow B$. In particular, the quasi- 
volume $\tilde\sigma(B)$ defined by formula (6) is continuous. In fact, as already noted, 
$\tilde\sigma(B)$ is defined for all blocks $B \subset \mathbf{B}$, and moreover, substituting $\sigma = \tilde\sigma$ 
into (12) and taking account of (8), we find that (13) holds with $\sigma = \tilde\sigma$. 




Another important notion is that of continuity on the empty set. A quasi- 
volume σ (defined on all blocks $B \subset \mathbf{B}$) is said to be continuous on the empty 
set if 

    $\lim_{m\to\infty} \sigma(B_m) = 0$ 

for every sequence $B_1 \supset B_2 \supset \cdots$ with an empty intersection 

    $\bigcap_{m=1}^\infty B_m.$ 
LEMMA. If a signed quasi-volume σ of bounded variation is continuous 
on the empty set, then so are its positive, negative and total variations. 


Proof. Let σ = p − q, where p and q are the positive and negative 
variations of σ, and suppose the intersection of the sequence of blocks 
$B_1 \supset B_2 \supset \cdots$ is empty. Then, as we now show, 

    $\lim_{m\to\infty} p(B_m) = 0.$    (14) 

Suppose (14) is not true. Then 

    $p(B_m) > \varepsilon$ 

for some ε > 0 and all m. Choose a subsequence $B_{m_k}$ of the sequence 
$B_m$ according to the following inductive rule, starting from $B_{m_1} = B_1$: 
Given $B_{m_k}$, let $B_{m_k}^{(1)}, \ldots, B_{m_k}^{(r_k)}$ be a set of disjoint subblocks of $B_{m_k}$ such 
that 

    $\sum_{j=1}^{r_k} \sigma(B_{m_k}^{(j)}) > \varepsilon$    (15) 

[recall the definition of p(B) on p. 81]. Since the sequence of blocks $B_m$ 
has an empty intersection, the same is true of every sequence $B_{m_k}^{(j)} B_m$ 
(for fixed j and k). But then, since σ(B) is continuous on the empty set, 

    $\sum_{j=1}^{r_k} \sigma(B_{m_k}^{(j)} B_m) < \frac{\varepsilon}{2}$ 

for sufficiently large $m = m_{k+1}$, and hence 

    $\sum_{j=1}^{r_k} \sigma(B_{m_k}^{(j)} - B_{m_k}^{(j)} B_{m_{k+1}}) > \frac{\varepsilon}{2},$ 

where $\sigma(B_{m_k}^{(j)} - B_{m_k}^{(j)} B_{m_{k+1}})$ is shorthand for the sum of the quasi-volumes 
of any disjoint set of blocks with union $B_{m_k}^{(j)} - B_{m_k}^{(j)} B_{m_{k+1}}$ (such a set exists 
by Property 2, p. 62). This determines the next block $B_{m_{k+1}}$ in the 
subsequence, which in turn leads to a new set of blocks $B_{m_{k+1}}^{(j)}$ satisfying (15) 
with k replaced by k + 1. By construction, none of the blocks whose 
union is $B_{m_k}^{(j)} - B_{m_k}^{(j)} B_{m_{k+1}}$ intersects any of the blocks whose union is 
$B_{m_l}^{(i)} - B_{m_l}^{(i)} B_{m_{l+1}}$ if k ≠ l. It follows that 

    $\sum_{k=1}^N \sum_{j=1}^{r_k} \sigma(B_{m_k}^{(j)} - B_{m_k}^{(j)} B_{m_{k+1}}) > \frac{N\varepsilon}{2},$ 

i.e., the left-hand side can be made as large as we please by choosing N 
large enough. But this contradicts the assumption that the quasi-volume 
σ is of bounded variation, and therefore (14) must hold. Thus, if σ is 
continuous on the empty set, so is p, and hence so are q = p − σ and 
v = p + q, as asserted. 


The relation between (upper) continuity and continuity on the empty 
set is revealed by 


THEOREM 2. A quasi-volume σ of bounded variation is continuous if 
and only if it is continuous on the empty set. 


Proof. If σ is continuous, then 

    $\sigma(B) = \tilde\sigma(B) = I_\sigma \chi_B.$    (16) 

Given any sequence of blocks $B_1 \supset B_2 \supset \cdots$ with an empty intersection, 
we set $B = B_m$ in (16). Then, noting that $\chi_{B_m} \searrow 0$ and applying Levi’s 
theorem, we find that $\sigma(B_m) \to 0$, i.e., σ is continuous on the empty set. 

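To prove the converse, we first assume that σ is nonnegative. Let 
$B_m \searrow B$ and write 

    $B = \{x: \alpha_1 < x_1 \le \beta_1, \ldots, \alpha_n < x_n \le \beta_n\},$ 
    $B_m = \{x: \alpha_1^{(m)} < x_1 \le \beta_1^{(m)}, \ldots, \alpha_n^{(m)} < x_n \le \beta_n^{(m)}\},$ 
    $B_m'^{(k)} = \{x: \alpha_k < x_k \le \alpha_k^{(m)}\}, \qquad B_m''^{(k)} = \{x: \beta_k < x_k \le \beta_k^{(m)}\} \qquad (k = 1, \ldots, n).$ 

Then, clearly 

    $B_1'^{(k)} \supset B_2'^{(k)} \supset \cdots, \qquad \bigcap_{m=1}^\infty B_m'^{(k)} = \emptyset,$ 
    $B_1''^{(k)} \supset B_2''^{(k)} \supset \cdots, \qquad \bigcap_{m=1}^\infty B_m''^{(k)} = \emptyset,$ 

and hence, by hypothesis, 

    $\lim_{m\to\infty} \sigma(B_m'^{(k)}) = \lim_{m\to\infty} \sigma(B_m''^{(k)}) = 0.$ 

Moreover, it is easy to see that 

    $B \subset B_m \cup B_m'^{(1)} \cup \cdots \cup B_m'^{(n)}, \qquad B_m \subset B \cup B_m''^{(1)} \cup \cdots \cup B_m''^{(n)},$ 

and hence 

    $\sigma(B) \le \sigma(B_m) + \sum_{k=1}^n \sigma(B_m'^{(k)}),$    (17) 
    $\sigma(B_m) \le \sigma(B) + \sum_{k=1}^n \sigma(B_m''^{(k)}).$    (18) 

But (17) and (18) imply 

    $\sigma(B) \le \liminf_{m\to\infty} \sigma(B_m)$ 

and 

    $\limsup_{m\to\infty} \sigma(B_m) \le \sigma(B),$ 

respectively, which in turn imply 

    $\sigma(B) = \lim_{m\to\infty} \sigma(B_m),$ 

i.e., the quasi-volume σ is continuous. 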

Finally, we remove the restriction that σ be nonnegative. Suppose σ 
is signed and continuous on the empty set, with representation σ = 
p − q in terms of the positive and negative variations p and q. By the 
lemma, p and q are themselves continuous on the empty set. But then p 
and q are continuous, by the argument just given, and hence so is their 
difference σ = p − q. This completes the proof. 


Remark 1. Clearly, the lemma remains valid if we omit the phrase “on 
the empty set.” 


Remark 2. In the case n = 1, the above results take the following form: 
If the quasi-length σ is continuous, then its generating function F(x) = 
σ[a, x] is defined for every x ∈ [a, b] (recall that F(a) = 0 by definition), 
and moreover 

    $F(x_0) = \sigma[a, x_0] = \lim_{x \searrow x_0} \sigma[a, x] = \lim_{x \searrow x_0} F(x) = F(x_0 + 0)$    (19) 

if $x_0 > a$, i.e., F(x) is continuous from the right in (a, b). Conversely, if 
F(x) is defined in [a, b] and continuous from the right in (a, b), then (19) 
shows that σ is (upper) continuous in every interval $[a, x_0]$, $x_0 > a$. But σ 
is also continuous in every other interval $(\alpha_0, \beta_0] \subset [a, b]$, since $\alpha \searrow \alpha_0$, 
$\beta \searrow \beta_0$ implies 

    $\sigma(\alpha, \beta] = F(\beta) - F(\alpha) \to F(\beta_0) - F(\alpha_0) = \sigma(\alpha_0, \beta_0].$ 

As for the exceptional point x = a ([a, a] is not a block!), although F(a + 0) 
always exists, there is no reason for it to coincide with F(a) = 0. 




Next, as promised in the remark on p. 84, we establish the connection 
between the canonical representation of functionals given in Sec. 2.11 and 
the canonical representation of quasi-volumes given in Sec. 4.7: 


THEOREM 3. Let σ be a continuous signed quasi-volume of bounded 
variation, with canonical representation σ = p − q, and let $I_\sigma$ be the 
corresponding functional on the space $C(\mathbf{B})$,³ with canonical representation 
$I_\sigma = J - N$. Then the two representations are consistent, in the sense that 

    $J = I_p, \qquad N = I_q.$    (20) 


Proof. According to Theorem 1, the functionals J and N are 
Riemann-Stieltjes integrals with respect to nonnegative quasi-volumes 
$p_1$ and $q_1$, which are automatically continuous. Therefore the canonical 
representation of $I_\sigma$ can be written in the form 

    $I_\sigma = I_{p_1} - I_{q_1}.$    (21) 

On the other hand, it is obvious that 

    $I_\sigma = I_p - I_q.$ 

Therefore 

    $I_p f \ge I_{p_1} f$ 

for every $f \in C(\mathbf{B})$, because of the basic minimal property of the canonical 
representation (21). But this implies 

    $p(B) \ge p_1(B),$    (22) 

since, according to Remark 1, p. 102, the quasi-volume p is continuous. 
Moreover, according to (21) again, we have $\sigma(B) = p_1(B) - q_1(B)$, and 
hence 

    $p_1(B) \ge p(B),$    (23) 

by the basic minimal property of the canonical representation σ = p − q. 
Comparing (22) and (23), we obtain $p(B) = p_1(B)$ and hence $q(B) = 
q_1(B)$. Since $J = I_{p_1}$, $N = I_{q_1}$, this implies (20), as required. 


3 I.e., the Riemann-Stieltjes integral with respect to the quasi-volume σ. 


5.7. Equivalent Quasi-Volumes 


As shown in Sec. 4.4, starting from any quasi-volume σ defined on a 
dense set Q of blocks $B \subset \mathbf{B}$, we can construct the Riemann-Stieltjes integral 

    $I_\sigma f = \int_{\mathbf{B}} f(x)\,\sigma(dx)$ 

of any function f(x) continuous in $\mathbf{B}$. In so doing, we do not exclude the 
possibility that different quasi-volumes $\sigma_1$ and $\sigma_2$ may lead to identical 
values of the integrals $I_{\sigma_1} f$ and $I_{\sigma_2} f$, and hence (for nonnegative quasi-volumes) 
to identical spaces $L_{\sigma_1}$ and $L_{\sigma_2}$ of Lebesgue-Stieltjes integrable functions. 
As in Sec. 4.4.4, such quasi-volumes are called equivalent. For example, 
given a quasi-volume σ of bounded variation, we can use Theorem 1 to 
construct an equivalent continuous quasi-volume $\tilde\sigma$. Moreover, if two 
equivalent quasi-volumes are continuous, they must coincide. In fact, if $B_m \searrow B$, 
then, according to Secs. 5.5 and 5.6, 

    $\sigma_1(B) = \lim_{m\to\infty} \sigma_1(B_m) = \tilde\sigma_1(B), \qquad \sigma_2(B) = \lim_{m\to\infty} \sigma_2(B_m) = \tilde\sigma_2(B),$ 

where $\tilde\sigma_1$ and $\tilde\sigma_2$ coincide because of the equivalence of $\sigma_1$ and $\sigma_2$. In other 
words, the class of all quasi-volumes equivalent to a given quasi-volume 
contains a unique continuous quasi-volume. 

It follows at once from the above considerations that two quasi-volumes 
$\sigma_1$ and $\sigma_2$ are equivalent if and only if $B_m' \searrow B$, $B_m'' \searrow B$, $B_m' \in Q(\sigma_1)$, 
$B_m'' \in Q(\sigma_2)$ implies 

    $\lim_{m\to\infty} \sigma_1(B_m') = \lim_{m\to\infty} \sigma_2(B_m'').$ 

For nonnegative quasi-volumes, we can say even more: 


THEOREM 4. Two nonnegative quasi-volumes $\sigma_1$ and $\sigma_2$ are equivalent 
if and only if $B' \Subset B''$ implies 

    $\sigma_1(B') \le \sigma_2(B'')$ if $B' \in Q_1$, $B'' \in Q_2$, 
    $\sigma_2(B') \le \sigma_1(B'')$ if $B' \in Q_2$, $B'' \in Q_1$.    (24) 


Proof. Suppose $\sigma_1$ and $\sigma_2$ are equivalent. Choosing blocks $B' \in Q_1$, 
$B'' \in Q_2$, $B' \Subset B''$, let f(x) be a continuous function, taking values between 
0 and 1, such that 

    $f(x) = \begin{cases} 1 & \text{for } x \in B', \\ 0 & \text{for } x \notin B''. \end{cases}$ 

Then, obviously, 

    $\sigma_1(B') \le \int_{\mathbf{B}} f(x)\,\sigma_1(dx) = \int_{\mathbf{B}} f(x)\,\sigma_2(dx) \le \sigma_2(B''),$ 

as required. 

Conversely, suppose (24) holds for two quasi-volumes $\sigma_1$ and $\sigma_2$. 
Then, starting from the Riemann-Stieltjes integrals, we can use Theorem 
1 to construct the corresponding quasi-volumes $\tilde\sigma_1$ and $\tilde\sigma_2$. According 
to Sec. 5.5, given any block $B \subset \mathbf{B}$, 

    $\tilde\sigma_1(B) = \lim_{m\to\infty} \sigma_1(B_m^{(1)}), \qquad \tilde\sigma_2(B) = \lim_{m\to\infty} \sigma_2(B_m^{(2)}),$ 

where $B_m^{(1)} \in Q_1$ and $B_m^{(2)} \in Q_2$ are any two sequences of blocks converging 
downward to the block B. Given a sequence $B_m^{(1)} \searrow B$, we can always 
find another sequence $B_m^{(2)} \searrow B$ such that 

    $B_m^{(1)} \Subset B_m^{(2)} \qquad (m = 1, 2, \ldots).$ 

Then, by hypothesis, 

    $\sigma_1(B_m^{(1)}) \le \sigma_2(B_m^{(2)}),$ 

and hence 

    $\tilde\sigma_1(B) = \lim_{m\to\infty} \sigma_1(B_m^{(1)}) \le \lim_{m\to\infty} \sigma_2(B_m^{(2)}) = \tilde\sigma_2(B).$    (25) 

On the other hand, we could just as well have found a sequence $B_m^{(2)} \searrow B$ 
satisfying the opposite inclusion relation 

    $B_m^{(2)} \Subset B_m^{(1)},$ 

and then 

    $\tilde\sigma_1(B) = \lim_{m\to\infty} \sigma_1(B_m^{(1)}) \ge \lim_{m\to\infty} \sigma_2(B_m^{(2)}) = \tilde\sigma_2(B),$    (26) 

since $\tilde\sigma_2(B)$ does not depend on the choice of the sequence $B_m^{(2)}$. Together, 
(25) and (26) imply $\tilde\sigma_1(B) = \tilde\sigma_2(B)$ for any block B. But since $\sigma_1$ is 
equivalent to $\tilde\sigma_1$ and $\sigma_2$ is equivalent to $\tilde\sigma_2$, it follows that $\sigma_1$ is equivalent to $\sigma_2$, 
as asserted. 


Remark. In the case $n = 1$, the above results take the following form: Let $\sigma_1(\alpha, \beta]$ and $\sigma_2(\alpha, \beta]$ be two quasi-lengths of bounded variation, with corresponding generating functions $F(x)$ and $G(x)$, defined on dense sets $E_1$ and $E_2$ (in $[a, b]$). Then $\sigma_1$ and $\sigma_2$ are equivalent if and only if $\tilde F(x) = \tilde G(x)$ for every $x \in [a, b]$, where $\tilde F$ and $\tilde G$ are defined as in Remark 2, p. 99. Moreover, if $\sigma_1$ and $\sigma_2$ are nonnegative, they are equivalent if and only if $x' < x''$ implies

$$\begin{aligned} F(x') &\le G(x'') && \text{if } x' \in E_1,\ x'' \in E_2, \\ G(x') &\le F(x'') && \text{if } x' \in E_2,\ x'' \in E_1. \end{aligned}$$


5.8. Construction of the Lebesgue-Stieltjes Integral 
with Step Functions as Elementary Functions 


Suppose we want to construct the Lebesgue-Stieltjes integral with respect to a given quasi-volume $\sigma$, choosing the elementary functions to be step functions (as in Sec. 3.1, for the ordinary Lebesgue integral). Then the natural choice of the elementary integral is

$$I_\sigma h = \sum_{j=1}^{m} h_j\,\sigma(B_j), \qquad B_j = \{x : h(x) = h_j\}, \tag{27}$$

which differs from formula (1), p. 50 only by having $\sigma$ in place of $s$ (as usual, $\{B_j\}$ is a partition of the basic block $\mathbf{B}$). We begin by assuming that $\sigma$ is nonnegative, removing this restriction at the end of the section. Before we can apply the general Daniell scheme of Chap. 2 (with $H$ the family of step functions defined on the set $X = \mathbf{B}$), it must be verified that $I_\sigma$ satisfies all the axioms for an elementary integral given in Sec. 2.1. Axioms 1 and 2 are obvious, but we must be careful about Axiom 3, which asserts that if $h_m(x) \searrow 0$, then $I_\sigma h_m \to 0$. In fact, it turns out that Axiom 3 is not satisfied unless an extra condition is imposed on the quasi-volume $\sigma$:


THEOREM 5. The proposed elementary integral $I_\sigma$ satisfies Axiom 3 if and only if the quasi-volume $\sigma$ is continuous.


Proof. Let $B$ be any block and $B_m \searrow B$ any sequence of blocks converging downward to $B$. If $I_\sigma$ satisfies Axiom 3 (besides Axioms 1 and 2), the Daniell construction will lead to an integral satisfying Lebesgue's theorem. But then, since the sequence of characteristic functions $\chi_{B_m}(x)$ obviously converges (everywhere) to the characteristic function $\chi_B(x)$, we have

$$\sigma(B) = I_\sigma \chi_B = \lim_{m\to\infty} I_\sigma \chi_{B_m} = \lim_{m\to\infty} \sigma(B_m),$$

and hence $\sigma$ is continuous.

Conversely, suppose $\sigma$ is continuous. It would be difficult to verify Axiom 3 directly, as in Sec. 1.5, since the functions $h_m(x)$ and the quasi-volume $\sigma$ may share "sheets of discontinuity."* However, an indirect proof, involving the integral $I_{\tilde\sigma}$ and the space $L_{\tilde\sigma}$ constructed in Sec. 5.1, can easily be given. In fact, let

$$h_m(x) = \sum_{j} h_j^{(m)} \chi_{B_j^{(m)}}(x)$$

be a nonincreasing sequence of step functions converging to zero. Then, since $L_{\tilde\sigma}$ contains all step functions (cf. p. 95), and since $\sigma$ is continuous ($\sigma = \tilde\sigma$), we have

$$I_\sigma h_m = \sum_{j} h_j^{(m)} \sigma(B_j^{(m)}) = \sum_{j} h_j^{(m)} \tilde\sigma(B_j^{(m)}) = I_{\tilde\sigma} h_m.$$

But Levi's theorem holds in the space $L_{\tilde\sigma}$, and hence $I_{\tilde\sigma} h_m \to 0$, i.e., $I_\sigma h_m \to 0$, as required.

*A sheet $S$ is said to be a sheet of discontinuity of the (nonnegative) quasi-volume $\sigma$ if there exists a sequence of blocks $B_1 \supset B_2 \supset \cdots$ whose intersection

$$\bigcap_{m=1}^{\infty} B_m$$

is contained in $S$, such that $\sigma(B_m) \not\to 0$. There is nothing to prevent an (upper) continuous quasi-volume $\sigma$ from having sheets of discontinuity. For example, suppose $\sigma$ is the continuous quasi-length with generating function

$$F(x) = \begin{cases} 0 & \text{for } 0 \le x < 1, \\ 1 & \text{for } 1 \le x \le 2. \end{cases}$$

Then the point $x = 1$ is a sheet of discontinuity (i.e., a discontinuity point) of $\sigma$.
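
The role of continuity in Theorem 5 can be illustrated numerically in the one-dimensional case. The following is a minimal sketch, not the book's construction: it assumes half-open intervals $(\alpha, \beta]$ with $\sigma(\alpha, \beta] = F(\beta) - F(\alpha)$, and the names `sigma`, `elementary_integral`, `F_right`, `F_left` and `h_m` are hypothetical. The step functions $h_m = \chi_{(1,\,1+1/m]}$ decrease to zero at every point, so Axiom 3 requires $I_\sigma h_m \to 0$.

```python
def sigma(F, a, b):
    """Quasi-length of the half-open interval (a, b] generated by F (assumed convention)."""
    return F(b) - F(a)

def elementary_integral(F, pieces):
    """Formula (27): sum of h_j * sigma(B_j) over pieces given as (a_j, b_j, h_j)."""
    return sum(h * sigma(F, a, b) for a, b, h in pieces)

def F_right(x):
    """Unit jump at x = 1, right-continuous there (continuous quasi-length)."""
    return 1.0 if x >= 1 else 0.0

def F_left(x):
    """Unit jump at x = 1, not right-continuous there (quasi-length is not continuous)."""
    return 1.0 if x > 1 else 0.0

def h_m(m):
    """Step function equal to 1 on (1, 1 + 1/m] and 0 elsewhere; decreases to 0 pointwise."""
    return [(1.0, 1.0 + 1.0 / m, 1.0)]

vals_right = [elementary_integral(F_right, h_m(m)) for m in (1, 10, 100)]
vals_left = [elementary_integral(F_left, h_m(m)) for m in (1, 10, 100)]
```

With the right-continuous generating function the integrals vanish, while with the other one they stay equal to 1 for every $m$, so Axiom 3 fails.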


Thus all the prerequisites for constructing a theory of the integral, based on step functions as elementary functions, with the integral (27) as elementary integral, are satisfied. Let $L_\sigma = L_\sigma(\mathbf{B})$ denote the corresponding space of $\sigma$-summable functions, equipped with a Lebesgue-Stieltjes integral $I_\sigma f$. Then, as we now show, this construction of the Lebesgue-Stieltjes integral agrees with that of Sec. 5.1, based on continuous functions as elementary functions with the Riemann-Stieltjes integral as elementary integral, leading to a space $L_{\tilde\sigma}$ of $\tilde\sigma$-summable functions equipped with a Lebesgue-Stieltjes integral $I_{\tilde\sigma} f$. More precisely, we prove the following


THEOREM 6. The two constructions of the Lebesgue-Stieltjes integral are equivalent, i.e., $L_\sigma = L_{\tilde\sigma}$ and $I_\sigma f = I_{\tilde\sigma} f$.

Proof. As might be expected, the proof is word for word the same 
as that of the theorem on p. 54, if we make the following substitutions: 

LoL, LoL, I-11, [-I,, s(B)—>0o(B,), dx— o(dx). 


Finally, as promised, we remove the restriction that $\sigma$ be nonnegative. Let $\sigma$ be any continuous signed quasi-volume of bounded variation, with canonical representation $\sigma = p - q$ and total variation $v = p + q$. Then, according to Sec. 5.3, the spaces $L_\sigma$ and $L_{\tilde\sigma}$ figuring in Theorem 6 should be replaced by $L_v$ and $L_{\tilde v}$, while the integrals $I_\sigma$ and $I_{\tilde\sigma}$ take the form

$$I_\sigma = I_p - I_q, \qquad I_{\tilde\sigma} = I_{\tilde p} - I_{\tilde q}.$$

(Note that Theorem 3 guarantees the same integral, whether we follow the procedure of Sec. 2.11.2 or that of Sec. 5.3.) But, according to Theorem 6, $L_v = L_p \cap L_q = L_{\tilde p} \cap L_{\tilde q} = L_{\tilde v}$ and $I_p f = I_{\tilde p} f$, $I_q f = I_{\tilde q} f$. It follows that $I_\sigma f = I_{\tilde\sigma} f$, as required.


PROBLEMS 


1. Let $s(B)$ be ordinary volume, and let $\sigma(B)$ be the quasi-volume defined by

$$\sigma(B) = \begin{cases} 1 & \text{for } x_0 \in B, \\ 0 & \text{for } x_0 \notin B, \end{cases}$$

where $x_0$ is a fixed point and $\mathbf{B}$ is finite. Find a function summable with respect to $\sigma(B)$, but not with respect to $s(B)$.




Ans. For example, 
0 for x = Xp, 
9(x) = I 
ieee for x # Xp. 


2. Let $s(B)$ and $\sigma(B)$ be the same as in the preceding problem. Find a function summable with respect to $s(B)$, but not with respect to $\sigma(B)$.
Ans. For example,

$$\varphi(x) = \begin{cases} \infty & \text{for } x = x_0, \\ 0 & \text{for } x \ne x_0. \end{cases}$$


3. Let o(x, 6] be the quasi-length defined on the closed interval [0, 1] by the 
formula 


cS tne 
o(a, B =}, in ax. 
Find a function 9(x) which is Lebesgue-integrable (in the ordinary sense) but 


not s-summable, and a function ¢(x) which is s-summable but not Lebesgue- 
integrable. 


Ans. For example, 


9(x) = v(x) = 


x In? x’ l-—-x- 
4. Evaluate the Stieltjes integral 
3 
1 —|" (x) dF), 
where 
—| for O0O<x<l, 
| for O<x<l, 
o(x) = F(x) = 2 for 1<x <2, 
0 for 1<x <3, 9 fare ese G 
Ans. I = 1. 
5. Evaluate the Stieltjes integral 
"1 
I =| ox) dF(x), 
where 
1 for O0<x <i, 0 for 0<x <i, 
o(x) = F(x) = 
for }<x<l, for §}<x<1 
Ans. I = 1(). 


6. Suppose one tries to construct a theory of Stieltjes integration in which the basic block $\mathbf{B}$ and all its subblocks are half-open sets of the form

$$\{x : \alpha_1 < x_1 \le \beta_1, \ldots, \alpha_n < x_n \le \beta_n\}.$$

Show that the analogue of Theorem 1, p. 95 fails to hold for the space $C(\mathbf{B})$ of all functions uniformly continuous in $\mathbf{B}$.


Hint. Let $\mathbf{B} = (0, 1]$, $\sigma(0, \beta] = 1$, $\sigma(\alpha, \beta] = 0$ if $\alpha > 0$, and consider the sequence

$$f_m(x) = \begin{cases} 1 & \text{for } 0 < x \le \dfrac{1}{m}, \\[4pt] m\left(\dfrac{2}{m} - x\right) & \text{for } \dfrac{1}{m} \le x \le \dfrac{2}{m}, \\[4pt] 0 & \text{for } \dfrac{2}{m} \le x \le 1 \end{cases}$$

of functions uniformly continuous in $\mathbf{B}$. Then $f_m(x) \searrow 0$ for all $x \in \mathbf{B}$, but $I_\sigma f_m = 1$ for all $m$.


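7. Let $\sigma(\alpha, \beta]$ be the quasi-length defined on the closed interval $[0, 2]$ by the formula

$$\sigma(\alpha, \beta] = \begin{cases} 1 & \text{if } x_0 = 1 \text{ is an interior point of } (\alpha, \beta), \\ 0 & \text{otherwise.} \end{cases}$$

Show that if $B = [0, 1]$, then $\sigma(B) = 0$ but $\tilde\sigma(B) = 1$.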


8. There is a proof of Theorem 1, p. 95 based on the Hahn-Banach theorem, according to which a linear functional can be extended from a given normed linear space $R$ to any larger normed linear space $R' \supset R$. In fact, the Hahn-Banach theorem is used to extend the given linear functional $If$ from its original domain $C(\mathbf{B})$ to the space $S(\mathbf{B})$ of all bounded functions $f(x)$ with the same norm

$$\sup_{x \in \mathbf{B}} |f(x)|.$$

The values of the extended functional on characteristic functions of blocks define a quasi-volume $\sigma(B)$ of bounded variation, which is in general not continuous. The quasi-volume $\sigma(B)$ can then be used to define a Riemann-Stieltjes integral which coincides with $If$ on continuous functions. Going from $\sigma(B)$ to $\tilde\sigma(B)$ as in Sec. 5.5 makes the quasi-volume continuous without changing Riemann-Stieltjes integrals of continuous functions. Carry out this construction (due to L. A. Lusternik) in detail.


Part 3 


MEASURE 


6 


MEASURABLE SETS 
AND GENERAL MEASURE THEORY 


In constructing a theory of the integral, our first step was to define the 
volume (or quasi-volume) of certain “elementary figures,” namely blocks. 
We now use our fully developed theory of the integral to construct the volume 
(or quasi-volume) of “nonelementary figures,” i.e., sets of a more or less 
general nature. As might be expected, the "volume" of a figure $M$ should
be defined as the integral of its characteristic function $\chi_M(x)$, equal to 1
on $M$ and 0 outside $M$. This approach leads to certain difficulties, which
can be circumvented in a way that becomes more transparent if we adopt 
an abstract point of view. 


6.1. More on Measurable Functions 


As in Sec. 2.1 we start from a family $H$ of elementary functions, defined on an abstract set $X$, which satisfy Axioms a and b on p. 23. One of our main concerns in this chapter will be the class of measurable functions, already encountered incidentally in Part 1. According to Sec. 2.8.1, a function $f(x)$, defined on a set $X$, is called measurable if it is finite almost everywhere and can be represented on a set of full measure as the limit of a convergent sequence of elementary functions. In the first instance, this definition implies that any function differing from a measurable function only on a set of measure zero is itself measurable. The following are some further consequences of the definition of measurability:


1) If $f$ and $g$ are measurable, so that $f = \lim h_n$ on a set of full measure $E$ and $g = \lim k_n$ on a set of full measure $F$, then, given any real numbers $\alpha$ and $\beta$, we have $\alpha f + \beta g = \lim (\alpha h_n + \beta k_n)$ on a set of full measure $EF$ (the intersection of $E$ and $F$), and hence $\alpha f + \beta g$ is also measurable.


2) If $f = \lim h_n$ is measurable, so is $|f| = \lim |h_n|$. It follows that if $f$ is measurable, so are $f^+$ and $f^-$, and hence, if $f$ and $g$ are measurable, so are

$$\max(f, g) = (f - g)^+ + g, \qquad \min(f, g) = -\max(-f, -g).$$


3) Every function $f \in L^+$ is measurable, being the limit of a sequence of elementary functions (in fact, a nondecreasing sequence). Moreover, every summable function $\varphi \in L$ is measurable, since $\varphi = f - g$ where $f, g \in L^+$.


THEOREM 1. If $f_n$ is a nondecreasing sequence of summable functions converging almost everywhere to a finite limit $f$, then $f$ is measurable.


Proof. The proof bears a resemblance to those of Theorems 1 and 2 of Chap. 2. First suppose that $f_n \in L^+$, and let $h_{nk}$ ($k = 1, 2, \ldots$) be a sequence of elementary functions such that $h_{nk} \nearrow f_n$ as $k \to \infty$. If

$$h_n = \max(h_{1n}, \ldots, h_{nn}),$$

then $h_n(x)$ is a (nondecreasing) sequence of elementary functions, and hence has a finite or infinite limit $f^*(x)$ for every $x \in X$. For $n \ge k$, we have

$$h_{kn} \le h_n \le f_n.$$

Taking the limit of this inequality as $n \to \infty$, we obtain

$$f_k \le f^* \le f,$$

which shows that $f^*$, like $f$, is finite almost everywhere. Therefore $f^*$ is measurable.¹ But $f^* = f$ almost everywhere, since $f_k \nearrow f$, and hence $f$ is measurable.

Now let the f,, be arbitrary summable functions. We have 


Le 
f=lim f, = ep, 
noo n=0 


where 
Po =f = fo = 954 a) G, —Sner—STnse* . 


are nonnegative summable functions. According to the final observation of Sec. 2.5, we can set $\varphi_n = g_n - g_n^*$, where the functions $g_n$ and $g_n^*$ are nonnegative and belong to $L^+$, and moreover

$$I g_n^* < \frac{1}{2^n}.$$


1 Although, in general, not summable. 


The sum of the series

$$\sum_{n=0}^{\infty} g_n^*$$

is a function $g^* \in L^+$, while the series

$$\sum_{n=0}^{\infty} g_n = \sum_{n=0}^{\infty} \varphi_n + \sum_{n=0}^{\infty} g_n^*$$

converges almost everywhere to a finite limit $g$, which, according to what was just proved, is a measurable function. Therefore

$$f = \sum_{n=0}^{\infty} \varphi_n = \sum_{n=0}^{\infty} g_n - \sum_{n=0}^{\infty} g_n^* = g - g^*$$

is measurable, as asserted.


The example

$$\varphi(x) = \frac{1}{x} \qquad (0 < x \le 1)$$

shows that there are measurable functions which are not summable. However, as remarked in Sec. 2.8.1, if a measurable function $\varphi(x)$ satisfies the inequality

$$|\varphi(x)| \le \varphi_0(x),$$

where $\varphi_0(x)$ is a nonnegative summable function, then $\varphi(x)$ is summable. Using this fact, we can prove the converse of Theorem 1 for nonnegative measurable functions, i.e., every nonnegative measurable function is the limit of a nondecreasing sequence of summable functions. Indeed, if

$$f(x) = \lim_{n\to\infty} h_n(x) \ge 0,$$

and if $h_n(x) \ge 0$ (as can be assumed without loss of generality), then

$$f(x) = \lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} \min\{f(x), \max[h_1(x), \ldots, h_n(x)]\}.$$

The function $f_n(x)$ is summable, since it is measurable, nonnegative and no greater than the summable function $\max\{h_1(x), \ldots, h_n(x)\}$. Moreover, it is obvious that $f_n(x) \nearrow f(x)$ as $n \to \infty$.


THEOREM 2. If $f_n$ is a sequence of measurable functions converging almost everywhere to a function $f$, then $f$ is measurable.


Proof. There is no loss of generality in restricting $f$ and $f_n$ to be nonnegative (otherwise, consider $f^+ = \lim f_n^+$ and $f^- = \lim f_n^-$ separately). The measurable function $f_n(x)$ is the limit as $p \to \infty$ of a sequence of elementary functions $h_p^{(n)}(x)$, which can be assumed to be nonnegative with positive integrals. Consider the function

$$\varphi_0(x) = \sum_{n,p=1}^{\infty} c_p^{(n)} \frac{h_p^{(n)}(x)}{I h_p^{(n)}}, \tag{1}$$

where the (positive) coefficients $c_p^{(n)}$ are such that the series

$$\sum_{n,p=1}^{\infty} c_p^{(n)}$$

converges. Then, by Levi's theorem, $\varphi_0(x)$ is summable, since, by construction, the series of integrals of the separate terms in the right-hand side of (1) converges. Obviously, $\varphi_0(x) > 0$ wherever $f(x) > 0$. It follows that $f(x)$ can also be represented as the limit of the nondecreasing sequence

$$g_n(x) = \min\{f(x), n\varphi_0(x)\}.$$

Because of Theorem 1, we need only verify that measurability of $f_n(x)$ implies summability of $g_n(x)$. But clearly

$$g_n(x) = \min\{f(x), n\varphi_0(x)\} = \lim_{m\to\infty} \min\{f_m(x), n\varphi_0(x)\},$$

where the functions $\min\{f_m(x), n\varphi_0(x)\}$, $m = 1, 2, \ldots$ are measurable and bounded by the summable function $n\varphi_0(x)$, and hence themselves summable. It follows from Lebesgue's theorem that their limit function $g_n(x)$ is also summable, and the theorem is proved.


COROLLARY. Let $f_1(x), f_2(x), \ldots$ be an arbitrary sequence of measurable functions. Then the functions

$$\inf_n f_n(x) = \lim_{n\to\infty} \inf\{f_1(x), \ldots, f_n(x)\},$$
$$\sup_n f_n(x) = \lim_{n\to\infty} \sup\{f_1(x), \ldots, f_n(x)\}$$

are also measurable, if they are finite almost everywhere, and the same is true of the functions

$$\varliminf_{n\to\infty} f_n(x) = \lim_{n\to\infty} \inf\{f_n(x), f_{n+1}(x), \ldots\},$$
$$\varlimsup_{n\to\infty} f_n(x) = \lim_{n\to\infty} \sup\{f_n(x), f_{n+1}(x), \ldots\}.$$

6.2. Measurable Sets 


A set $E \subset X$ is said to be measurable if its characteristic function $\chi_E(x)$ (equal to 1 on $E$ and 0 outside $E$) is measurable. If the function $\chi_E(x)$ is summable as well as measurable, the set $E$ is said to be summable, and the number $\mu(E) = I\chi_E$ is called the measure of the set $E$. If a set is measurable but not summable, its measure is taken to be $+\infty$. No measure, finite or infinite, is assigned to a nonmeasurable set.

A measurable subset of a summable set is summable (its characteristic 
function is summable, being measurable and bounded by a summable 
function). Any subset of a set of measure zero is measurable and has measure 
zero (as must be expected!). The empty set is regarded as measurable and 
summable, and is assigned the measure zero. 


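The formulas

$$\chi_{E \cup F} = \max(\chi_E, \chi_F),$$
$$\chi_{EF} = \min(\chi_E, \chi_F),$$
$$\chi_{E - F} = \chi_E - \chi_F \qquad (E \supset F)$$

show that the union, intersection and difference of two measurable sets are measurable. Similarly, the union, intersection and difference of two summable sets are summable, and moreover

$$\mu(E) \le \mu(F) \qquad (E \subset F),$$
$$\mu(E \cup F) \le \mu(E) + \mu(F),$$
$$\mu(E \cup F) = \mu(E) + \mu(F) \qquad (EF = \varnothing),$$
$$\mu(E - F) = \mu(E) - \mu(F) \qquad (E \supset F).$$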
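
These formulas are easy to check on a toy model. The sketch below assumes a hypothetical ten-point space with counting measure, so that the "integral" of a function is just the sum of its values; the sets `E`, `F`, `G` and the helper names are illustrative only.

```python
X = set(range(10))          # hypothetical finite space
E = {1, 2, 3, 4}
F = {3, 4, 5}
G = {3, 4}                  # a subset of E, used for the difference formula

def chi(A):
    """Characteristic function of A, stored as a dict over X."""
    return {x: 1 if x in A else 0 for x in X}

def integral(f):
    """Toy integral: counting measure on X (an assumption of this sketch)."""
    return sum(f[x] for x in X)

def mu(A):
    """Measure of A, defined as the integral of its characteristic function."""
    return integral(chi(A))

union_ok = chi(E | F) == {x: max(chi(E)[x], chi(F)[x]) for x in X}
inter_ok = chi(E & F) == {x: min(chi(E)[x], chi(F)[x]) for x in X}
diff_ok = chi(E - G) == {x: chi(E)[x] - chi(G)[x] for x in X}
```

In this model `mu(E | F)` is 5 while `mu(E) + mu(F)` is 7, so the subadditivity inequality is strict because `E` and `F` overlap.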


6.3. Countable Additivity of Measure 


A key proposition of measure theory is


THEOREM 3. If the sets $E_1, \ldots, E_n, \ldots$ are measurable, then their union

$$E = \bigcup_{n=1}^{\infty} E_n$$

is measurable. Moreover, measure is countably additive in the sense that if the sets $E_1, \ldots, E_n, \ldots$ are disjoint, then

$$\mu(E) = \mu(E_1) + \cdots + \mu(E_n) + \cdots, \tag{2}$$

where (2) may reduce to $\infty = \infty$.


Proof. By hypothesis, each set $E_n$ has a measurable characteristic function $\chi_{E_n}(x)$. Therefore, by the corollary to Theorem 2, the characteristic function of the set $E$, i.e.,

$$\chi_E(x) = \sup\{\chi_{E_1}(x), \ldots, \chi_{E_n}(x), \ldots\} = \lim_{n\to\infty} \sup\{\chi_{E_1}(x), \ldots, \chi_{E_n}(x)\},$$

is also measurable, and hence $E$ is a measurable set, as asserted.

To prove the countable additivity (2), we first note that if some $\mu(E_n)$ is infinite then so is $\mu(E)$, since $E \supset E_n$, and hence (2) reduces to $\infty = \infty$. Therefore we can assume that all the $E_n$ are summable, with $\mu(E_n) = I\chi_{E_n}$. If the $E_n$ are disjoint, then

$$\chi_E(x) = \sum_{n=1}^{\infty} \chi_{E_n}(x).$$

It follows from Levi's theorem that $\chi_E$ is summable, with

$$I\chi_E = \sum_{n=1}^{\infty} I\chi_{E_n},$$

if the series

$$\sum_{n=1}^{\infty} I\chi_{E_n} = \sum_{n=1}^{\infty} \mu(E_n) \tag{3}$$

converges. Conversely, if $\chi_E$ is summable, then

$$\sum_{n=1}^{N} I\chi_{E_n} \le I\chi_E$$

for any $N$, and hence the series (3) converges, i.e., $\chi_E$ is not summable if (3) diverges. Equation (2) holds in either case, and the theorem is proved.


COROLLARY. If the sets $E_1, \ldots, E_n, \ldots$ are measurable and $E_1 \subset E_2 \subset \cdots$, then their union

$$E = \bigcup_{n=1}^{\infty} E_n$$

is measurable, and

$$\mu(E) = \lim_{n\to\infty} \mu(E_n), \tag{4}$$

where (4) may reduce to $\infty = \infty$.


Proof. If some $\mu(E_n)$ is infinite, then so are $\mu(E)$ and $\lim \mu(E_n)$ [since $\mu(E_{n+p}) = \infty$ for all $p$], and hence (4) reduces to $\infty = \infty$. Otherwise, the formula

$$E = E_1 \cup (E_2 - E_1) \cup \cdots$$

represents $E$ as a union of disjoint measurable sets, and therefore, by countable additivity,

$$\mu(E) = \mu(E_1) + \mu(E_2 - E_1) + \cdots = \lim_{n\to\infty} \sum_{k=1}^{n} \mu(E_k - E_{k-1}) = \lim_{n\to\infty} \mu(E_n) \qquad (E_0 = \varnothing),$$

as required.


THEOREM 4. If the sets $E_1, \ldots, E_n, \ldots$ are measurable, then their intersection

$$F = \bigcap_{n=1}^{\infty} E_n$$

is measurable. Moreover, if $E_1 \supset E_2 \supset \cdots$ and $\mu(E_1) < \infty$, then

$$\mu(F) = \lim_{n\to\infty} \mu(E_n),$$

where the condition $\mu(E_1) < \infty$ cannot be dropped.²


Proof. The first assertion follows at once from Theorem 3, after taking complements relative to $E_1$. To prove the second assertion, we represent $E_1$ as a union of disjoint measurable sets by writing

$$E_1 = F \cup (E_1 - E_2) \cup (E_2 - E_3) \cup \cdots,$$

and then use countable additivity.


6.4. Stone’s Axioms 


In addition to Axioms a and b, p. 23, we shall henceforth impose 
two further axioms, called Stone's axioms, on the family H of elementary 
functions: 


c) If $h(x)$ belongs to $H$, then so does the function $\min\{h(x), 1\}$, i.e., the function $h(x)$ truncated above the level 1.


d) There exists a sequence of nonnegative functions $h_n(x) \in H$ such that $Ih_n > 0$ and $\sup_n h_n(x) > 0$ for every $x \in X$.


Both axioms hold automatically if H contains the function identically equal 
to 1, a case which occurs whenever the space X is of finite “volume” u(X) = 
I(1). However, we want the general case to include integration over spaces 
of infinite volume. 

Axiom c also applies to measurable functions, as we see by passing to the limit. Thus if $\varphi = \lim h_n$ is measurable, so is $\min(\varphi, 1) = \lim \min(h_n, 1)$.

Axiom d implies the existence of a summable function $\varphi_0(x)$ which is positive for all $x \in X$. In fact, the series

$$\varphi_0(x) = \sum_{n=1}^{\infty} \frac{1}{n^2} \frac{h_n(x)}{Ih_n},$$

where the $h_n(x)$ are the elementary functions figuring in the axiom, converges to a summable function, by Levi's theorem.


² See Prob. 2, p. 131.


The presence of the function $\varphi_0(x)$ allows us to deduce some new facts about the class of measurable functions. First of all, the function $f(x) \equiv 1$ is measurable, since

$$1 = \lim_{n\to\infty} \min\{1, n\varphi_0(x)\}$$

and Axiom c (valid, as noted, for all measurable functions) guarantees the measurability of the functions $\min\{1, n\varphi_0(x)\}$. Therefore $f(x) \equiv 1$ is also measurable, by Theorem 2. This implies the measurability of the space $X$ itself, since $X$ has the characteristic function $f(x) \equiv 1$. Then the complement $X - E$ of any measurable set $E$ (relative to the whole space) is measurable, being the difference between two measurable sets. Moreover, if $f(x) \equiv 1$ is measurable, so is any constant function $f(x) \equiv c$. In particular, if $\varphi$ is measurable and $a$, $b$, $c$ are any real numbers, then the following functions are all measurable:


1) $\min(\varphi, c)$, the function $\varphi(x)$ truncated above the level $c$;

2) $\max(\varphi, c)$, the function $\varphi(x)$ truncated below the level $c$;

3) $\max\{\min(\varphi, b), a\}$, the function $\varphi(x)$ truncated above the level $b$ and below the level $a$.


6.5. Characterization of Measurable Functions 
in Terms of Measure 


The relation between measurable functions and measurable sets is 
revealed further by 


THEOREM 5. An almost-everywhere finite function $\varphi(x)$ is measurable if and only if the set

$$E(\varphi; c) = \{x : \varphi(x) > c\}$$

is measurable for arbitrary real $c$.


Proof. If $\varphi(x)$ is measurable, the function

$$\varphi_{\varepsilon c}(x) = \frac{\min(\varphi, c + \varepsilon) - \min(\varphi, c)}{\varepsilon}$$

is measurable for arbitrary $c$ and $\varepsilon > 0$. The function $\varphi_{\varepsilon c}(x)$ equals 0 for $\varphi(x) \le c$ and 1 for $\varphi(x) \ge c + \varepsilon$, and takes values between 0 and 1. As $\varepsilon \to 0$, $\varphi_{\varepsilon c}(x)$ approaches a limit equal to 0 for $\varphi(x) \le c$ and 1 for $\varphi(x) > c$. Thus the characteristic function $\chi_{E(\varphi; c)}$ of the set $E(\varphi; c)$ is a limit of measurable functions, which implies the measurability of $\chi_{E(\varphi; c)}$ and hence of $E(\varphi; c)$.

Conversely, suppose we know that the set $E(\varphi; c)$ is measurable for arbitrary $c$. Then the set

$$\{x : c < \varphi(x) \le d\} = E(\varphi; c) - E(\varphi; d)$$

is also measurable, for arbitrary $c$ and $d$ ($c < d$). Given any $n$, consider the function $\varphi_n(x)$ equal to $k/n$ on the measurable set

$$E_{kn} = \left\{x : \frac{k}{n} < \varphi(x) \le \frac{k+1}{n}\right\} \qquad (k = 0, \pm 1, \pm 2, \ldots).$$

The function $\varphi_n(x)$ is defined for almost all $x$, and differs from $\varphi(x)$ by no more than $1/n$. Moreover, $\varphi_n(x)$ can be written in the form

$$\varphi_n(x) = \sum_{k=-\infty}^{\infty} \frac{k}{n} \chi_{E_{kn}}(x),$$

and hence is measurable. But then $\varphi(x)$ is also measurable, since $\varphi_n(x) \to \varphi(x)$ [uniformly on $X$] as $n \to \infty$. This completes the proof.
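
The approximating functions $\varphi_n$ used in the second half of the proof can be sketched pointwise. The following is only an illustration under stated assumptions: exact rational arithmetic is used to avoid rounding at the endpoints, and the sample values and the helper `phi_n` are hypothetical.

```python
from fractions import Fraction
import math

def phi_n(value, n):
    """Return k/n, where k is the integer with k/n < value <= (k + 1)/n."""
    k = math.ceil(value * n) - 1
    return Fraction(k, n)

# Hypothetical sample values of phi at a few points.
samples = [Fraction(-7, 3), Fraction(0), Fraction(1, 2), Fraction(5, 4), Fraction(22, 7)]

# Largest deviation phi - phi_n over the samples, for several n.
errors = {n: max(v - phi_n(v, n) for v in samples) for n in (1, 10, 100)}
```

For every sample the deviation lies in the half-open interval $(0, 1/n]$, which is the uniform estimate the proof relies on.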


Remark 1. If $\varphi$ is a nonnegative summable function, then

$$0 \le \min(\varphi, c) \le \varphi$$

for any $c > 0$, and hence $f = \min(\varphi, c)$ is also summable. This implies the summability of the set

$$E(\varphi; c) = \{x : \varphi(x) > c\},$$

already known to be measurable from Theorem 5. In fact, the characteristic function of $E(\varphi; c)$ is summable, since it satisfies the inequality

$$0 \le \chi_{E(\varphi; c)}(x) \le \frac{1}{c} f(x).$$


Remark 2. We have already seen that $f(x) \equiv 1$ is measurable. If $f(x) \equiv 1$ is also summable, then so is the whole space $X$, and $\mu(X) = I(1)$. In general, $f(x) \equiv 1$ is measurable but not summable, and $\mu(X) = +\infty$. In this case, it is easy to see that $X$ is the union of an increasing sequence of summable sets, i.e., the sets

$$E_n = \left\{x : \varphi_0(x) > \frac{1}{n}\right\},$$

where $\varphi_0(x)$ is the everywhere-positive summable function constructed in Sec. 6.4.


6.6. The Lebesgue Integral as Defined by Lebesgue 


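We are now in a position to relate the definition of the integral of a summable function $\varphi$ to the measures of certain summable sets constructed from $\varphi$. First suppose $\varphi$ is nonnegative. Then, given any $\varepsilon > 0$, consider the sets

$$E_{\varepsilon n} = \{x : n\varepsilon < \varphi(x) \le (n+1)\varepsilon\} \qquad (n = 0, 1, 2, \ldots). \tag{5}$$

If $n \ge 1$, $E_{\varepsilon n}$ is summable, being the difference between the two summable sets $\{x : \varphi(x) > n\varepsilon\}$ and $\{x : \varphi(x) > (n+1)\varepsilon\}$ (see Remark 1 above). Consider the function $\varphi_\varepsilon(x)$ equal to $n\varepsilon$ on $E_{\varepsilon n}$ ($n = 0, 1, 2, \ldots$) and 0 elsewhere [i.e., where $\varphi(x)$ itself vanishes]. Then

$$\varphi_\varepsilon(x) = \sum_{n=1}^{\infty} n\varepsilon\, \chi_{\varepsilon n}(x), \tag{6}$$

where $\chi_{\varepsilon n}(x)$ is the characteristic function of the set $E_{\varepsilon n}$. By Lebesgue's theorem, $\varphi_\varepsilon(x)$ is summable, since $\varphi_\varepsilon(x) \le \varphi(x)$, and

$$I\varphi_\varepsilon = \sum_{n=1}^{\infty} n\varepsilon\, I\chi_{\varepsilon n} = \sum_{n=1}^{\infty} n\varepsilon\, \mu(E_{\varepsilon n}). \tag{7}$$

Moreover $0 \le \varphi(x) - \varphi_\varepsilon(x) \le \varepsilon$, and hence $\varphi_\varepsilon(x) \to \varphi(x)$ as $\varepsilon \to 0$. Therefore, applying Lebesgue's theorem again, we find that³

$$I\varphi = \lim_{\varepsilon \to 0} I\varphi_\varepsilon = \lim_{\varepsilon \to 0} \sum_{n=1}^{\infty} n\varepsilon\, \mu\{x : n\varepsilon < \varphi(x) \le (n+1)\varepsilon\}.$$

The expression on the right is Lebesgue's original way of defining his integral.

Conversely, given a nonnegative function $\varphi$, suppose all the sets (5) are summable and

$$\sum_{n=1}^{\infty} n\varepsilon\, \mu(E_{\varepsilon n}) \le C$$

for all $\varepsilon > 0$. Then $\varphi(x)$ is summable. In fact, by Levi's theorem, the function $\varphi_\varepsilon(x)$ defined by (6) is summable for all $\varepsilon$, with integral (7). Since $\varphi_\varepsilon(x) \to \varphi(x)$ as $\varepsilon \to 0$, as already noted, and since $I\varphi_\varepsilon \le C$ for all $\varepsilon$, it follows from Fatou's lemma (see Sec. 2.8.2) that $\varphi(x)$ is summable, with integral (7).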
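
Lebesgue's sums can be evaluated explicitly in a simple case. The sketch below is an illustration, not part of the text: it assumes $\varphi(x) = x^2$ on $[0, 1]$ with ordinary length as the measure, so that the level set $\{n\varepsilon < x^2 \le (n+1)\varepsilon\}$ is an interval whose length is known in closed form. The function name `lebesgue_sum` is hypothetical.

```python
import math

def lebesgue_sum(eps):
    """Sum of n*eps * mu{x in [0, 1] : n*eps < x**2 <= (n + 1)*eps} over n >= 1."""
    total = 0.0
    n = 1
    while n * eps < 1.0:
        lo = math.sqrt(n * eps)                    # left end of the level set
        hi = min(math.sqrt((n + 1) * eps), 1.0)    # right end, clipped to [0, 1]
        total += n * eps * (hi - lo)
        n += 1
    return total

sums = {eps: lebesgue_sum(eps) for eps in (0.1, 0.01, 0.001)}
```

Since $0 \le \varphi - \varphi_\varepsilon \le \varepsilon$ and the interval has length 1, each sum lies below the exact value $1/3$ by at most $\varepsilon$, and the sums increase toward $1/3$ as $\varepsilon$ decreases.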

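To treat the case where $\varphi$ is of variable sign, we use the fact that $\varphi$ is summable if and only if $|\varphi|$ is summable. Therefore $\varphi$ is summable if and only if

$$\sum_{n=1}^{\infty} n\varepsilon \left[\mu\{x : n\varepsilon < \varphi(x) \le (n+1)\varepsilon\} + \mu\{x : -(n+1)\varepsilon \le \varphi(x) < -n\varepsilon\}\right] \le C \tag{8}$$

for all $\varepsilon > 0$ (provided the appropriate sets are all summable). Then the integral of $|\varphi|$ is given by the limit as $\varepsilon \to 0$ of the left-hand side of (8), while the integral of $\varphi$ itself is given by the limit as $\varepsilon \to 0$ of the expression

$$\sum_{n=1}^{\infty} n\varepsilon \left[\mu\{x : n\varepsilon < \varphi(x) \le (n+1)\varepsilon\} - \mu\{x : -(n+1)\varepsilon \le \varphi(x) < -n\varepsilon\}\right].$$


³ For simplicity, given a set $\{\cdots\}$, we write $\mu\{\cdots\}$ instead of $\mu(\{\cdots\})$.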




6.7. Integration over a Measurable Subset 


Until now, the region of integration has been the whole set $X$. However, it is an easy matter to define integration over an arbitrary measurable subset $E \subset X$. First we note that the product of two measurable functions $f$ and $g$ is again a measurable function. In fact, confining ourselves to nonnegative functions (which obviously entails no loss of generality), we need only note that

$$\{x : f(x)g(x) > c\} = \bigcup_{r} \left(\{x : f(x) > r\} \cap \{x : g(x) > c/r\}\right)$$

for any $c > 0$, where $r$ is an arbitrary positive rational number. In particular, the product of any measurable function $f(x)$ with the characteristic function $\chi_E(x)$ of a measurable set $E$ [$f(x)$ replaced by zero outside $E$] is again a measurable function. Similarly, a summable function replaced by zero outside a measurable set is again summable.

With this in mind, let $\varphi(x)$ be an arbitrary function defined on $X$. Then we say that $\varphi(x)$ is summable (measurable) on $E$ if the product $\chi_E \varphi$ is summable (measurable) on $X$, and we set

$$\int_E \varphi\, dx = I(\chi_E \varphi),$$

by definition. Obviously, the integral over $E$ has all the ordinary properties of the integral. Moreover, the following special properties are worthy of explicit consideration:


a) If $|\varphi(x)| \le M$ on $E$, then

$$\int_E |\varphi|\, dx \le M\mu(E).$$

In fact, $\chi_E |\varphi| \le M\chi_E$ on $X$, and hence

$$\int_E |\varphi|\, dx = I(\chi_E |\varphi|) \le M I\chi_E = M\mu(E).$$


b) If $\varphi$ is summable (measurable) on $E = E_1 \cup E_2 \cup \cdots$, where $E_1, E_2, \ldots$ are disjoint measurable sets, then $\varphi$ is summable (measurable) on every $E_n$. Moreover, if $\varphi$ is summable on $E$, then

$$\int_E \varphi\, dx = \int_{E_1} \varphi\, dx + \int_{E_2} \varphi\, dx + \cdots. \tag{9}$$

To see this, we note that if $\chi_E \varphi$ is measurable (summable) on $X$, then so is $\chi_{E_n} \chi_E \varphi = \chi_{E_n} \varphi$. Moreover, $\chi_{E_1} + \chi_{E_2} + \cdots = \chi_E$, and hence

$$\chi_{E_1} \varphi + \chi_{E_2} \varphi + \cdots = \chi_E \varphi.$$

Therefore, if $\varphi$ is summable on $E$, the partial sums of the series on the left are bounded in absolute value by the summable function $\chi_E |\varphi|$. This allows us to integrate term by term, and leads at once to (9).


c) Given a sequence of measurable sets $E_1, E_2, \ldots$, which are not necessarily disjoint, suppose $\varphi$ is measurable on every $E_n$. Then $\varphi$ is also measurable on $E = E_1 \cup E_2 \cup \cdots$. In fact,

$$\chi_E \varphi = \chi_{E_1 \cup E_2 \cup \cdots}\, \varphi = \chi_{E_1} \varphi + \chi_{E_2 - E_1 E_2}\, \varphi + \chi_{E_3 - E_1 E_3 - E_2 E_3}\, \varphi + \cdots,$$

where, by hypothesis and Property b above, every function on the right is measurable. Therefore, by Theorem 2, $\chi_E \varphi$ is measurable, as asserted. Let the $E_n$ be disjoint, let $\varphi$ be summable on every $E_n$, and suppose the series in the right-hand side of (9) converges. We would like to conclude that $\varphi$ is summable and that (9) holds (the converse of Property b). This conclusion is in general false (see Prob. 11, p. 133), but does hold if we impose the extra condition that $\varphi$ be nonnegative on every $E_n$. Then $\chi_E \varphi$ is the limit of the nondecreasing sequence

$$\chi_{E_1} \varphi + \cdots + \chi_{E_n} \varphi \qquad (n = 1, 2, \ldots),$$

with bounded integrals. Therefore, according to Levi's theorem, $\chi_E \varphi$ is summable and the relation (9) holds. This last fact is sometimes stated in the following equivalent form: If the function $\varphi$ is nonnegative and summable on every set $E_1, E_2, \ldots$, where $E_1 \subset E_2 \subset \cdots$ and if

$$\int_{E_n} \varphi(x)\, dx \le C$$

for all $n$, then $\varphi$ is summable on $E = E_1 \cup E_2 \cup \cdots$, and

$$\int_E \varphi(x)\, dx = \lim_{n\to\infty} \int_{E_n} \varphi(x)\, dx.$$


d) Absolute continuity of the integral on a set. The integral of a summable function $\varphi$ on a summable set $E$ approaches zero as $\mu(E) \to 0$, regardless of the character of $E$. More exactly, given any $\varepsilon > 0$, there exists a $\delta > 0$ such that $\mu(E) < \delta$ implies

$$\left|\int_E \varphi\, dx\right| < \varepsilon.$$

To see this, let $h(x) \ge 0$ be an elementary function such that

$$I\bigl(\bigl||\varphi| - h\bigr|\bigr) < \frac{\varepsilon}{2}.$$

The function $h(x)$ is bounded, i.e., $0 \le h(x) \le M$ for some $M$. If the summable set $E$ has measure less than $\delta = \varepsilon/2M$, then

$$\left|\int_E \varphi\, dx\right| \le \int_E |\varphi(x)|\, dx \le \int_E \bigl||\varphi(x)| - h(x)\bigr|\, dx + \int_E h(x)\, dx < \frac{\varepsilon}{2} + M\delta = \varepsilon,$$

as asserted.


6.8. Measure on a Product Space 


In Sec. 2.10 we constructed the Lebesgue integral on the Cartesian product $X \times Y$ of two sets $X$ and $Y$, starting from a family $H(W)$ of elementary functions $h(x, y)$ satisfying the hypotheses of Fubini's theorem. In Sec. 3.3 the existence of such a family was verified for the special case where $X$ and $Y$ are finite-dimensional blocks. We are now in a position to construct a suitable family $H(W)$ for the case of arbitrary sets $X$ and $Y$, equipped with Lebesgue integrals $I_X$ and $I_Y$. In fact, for $H(W)$ we make the "natural choice," i.e., the family of all functions of the form

$$h(x, y) = \sum_{j=1}^{m} a_j\, \chi_{E_j}(x)\, \chi_{F_j}(y),$$

where $m$ is arbitrary, $\chi_{E_j}(x)$ is the characteristic function of the set $E_j \subset X$, $\chi_{F_j}(y)$ is the characteristic function of the set $F_j \subset Y$, and all the $E_j$ and $F_j$ are summable. The family $H(W)$ is obviously closed under the formation of linear combinations. Moreover, without loss of generality, we can assume that the sets $E_j \subset X$ are disjoint, and similarly for the sets $F_j \subset Y$. Then, if $h(x, y)$ belongs to $H(W)$, so does its absolute value

$$|h(x, y)| = \sum_{j=1}^{m} |a_j|\, \chi_{E_j}(x)\, \chi_{F_j}(y).$$

Therefore $H(W)$ satisfies Axioms a and b, p. 23.
Next we define an integral $Ih$ on $H(W)$. Given any $h(x, y) \in H(W)$, let

$$Ih = \sum_{j=1}^{m} a_j\, \mu_X(E_j)\, \mu_Y(F_j),$$

where $\mu_X$ and $\mu_Y$ denote the measures in the spaces $X$ and $Y$, respectively.⁴ The space $H(W)$, equipped with this integral, clearly satisfies all the hypotheses of Fubini's theorem, i.e., every function $h(x, y) \in H(W)$ is summable in $x$ for all $y$, the integral

$$I_X h = \sum_{j=1}^{m} a_j\, \mu_X(E_j)\, \chi_{F_j}(y)$$

is summable in $y$, and

$$Ih = \sum_{j=1}^{m} a_j\, \mu_X(E_j)\, \mu_Y(F_j) = I_Y\{I_X h(x, y)\}.$$


⁴ Thus, for example, $\mu_X(E_j) = I_X \chi_{E_j}(x)$, and similarly for $\mu_Y$.


Moreover, the integral $Ih$ satisfies Axioms 1-3, p. 24:


1) $I(\alpha h + \beta k) = \alpha Ih + \beta Ik$;

2) $Ih \ge 0$ if $h(x) \ge 0$;

3) $Ih_n \to 0$ if $h_n \searrow 0$.

Axioms 1 and 2 are obvious, while to verify Axiom 3, we need only note that if $h_n(x, y) \searrow 0$ for every $x$ and $y$, then $I_X h_n(x, y) \searrow 0$ by Levi's theorem, and hence $I_Y\{I_X h_n(x, y)\} = Ih_n \to 0$ by the same theorem. Therefore we can use the Daniell scheme to construct a space $L(W)$ of $I$-summable functions. Since the space $H(W)$ satisfies all the hypotheses of Fubini's theorem, so does $L(W)$. In particular,

$$I\varphi = I_Y\{I_X \varphi(x, y)\}$$

for every $\varphi(x, y) \in L(W)$. Moreover, we can also write

$$I\varphi = I_X\{I_Y \varphi(x, y)\},$$

because of the symmetry between the roles of $x$ and $y$ in the definition of the elementary integral.
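
The elementary integral on $H(W)$ and its two iterated forms can be compared on a toy model. This sketch assumes finite sets $X$ and $Y$ whose measures are given by hypothetical point weights `wX` and `wY`; the coefficients and sets in `terms` are illustrative only, and exact fractions are used so that the three values can be compared for equality.

```python
from fractions import Fraction

# Hypothetical point weights defining measures on two finite sets X and Y.
wX = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(2)}
wY = {0: Fraction(1), 1: Fraction(1, 4)}

def mu(weights, A):
    """Measure of a subset A as the sum of its point weights."""
    return sum(weights[p] for p in A)

# Elementary function h(x, y) = sum_j a_j * chi_{E_j}(x) * chi_{F_j}(y),
# stored as triples (a_j, E_j, F_j).
terms = [(Fraction(3), {0, 1}, {0}), (Fraction(-2), {1, 2}, {0, 1})]

def h(x, y):
    """Value of the elementary function at the point (x, y)."""
    return sum(a for a, E, F in terms if x in E and y in F)

# Elementary integral: sum of a_j * mu_X(E_j) * mu_Y(F_j).
I_h = sum(a * mu(wX, E) * mu(wY, F) for a, E, F in terms)

# Iterated integrals I_Y{I_X h} and I_X{I_Y h}.
I_yx = sum(wY[y] * sum(wX[x] * h(x, y) for x in wX) for y in wY)
I_xy = sum(wX[x] * sum(wY[y] * h(x, y) for y in wY) for x in wX)
```

All three quantities agree, which is the content of the Fubini-type identity for elementary functions in this finite setting.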


*6.9. The Space $L_p$


If f(x) is measurable, then so is $|f(x)|^p$ for any p > 0, since

$$\{x : |f(x)|^p > C\} = \{x : |f(x)| > C^{1/p}\}$$

for any C > 0. Consider the space $L_p = L_p(X)$ consisting of all measurable
functions f(x), defined on a given set X, for which

$$I(|f|^p) = \int_X |f(x)|^p \, dx < \infty.$$

For p > 0, $L_p$ is a linear space. In fact, if f belongs to $L_p$, then obviously
so does $\alpha f$ for any real $\alpha$. Moreover, if $f \in L_p$, $g \in L_p$, then $f + g \in L_p$, since
measurability of f and g implies that of f + g, and

$$|f + g|^p \leq (|f| + |g|)^p \leq (2 \sup\{|f|, |g|\})^p = 2^p \sup\{|f|^p, |g|^p\} \leq 2^p (|f|^p + |g|^p).$$
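The pointwise bound used to show that the space is linear can be spot-checked numerically. This is a minimal sketch, assuming random real values for f(x) and g(x); the function name `check_lp_bound` and its parameters are hypothetical.

```python
import random


def check_lp_bound(p, trials=1000, seed=0):
    """Check |f + g|^p <= 2^p (|f|^p + |g|^p) on random real pairs (f, g)."""
    rng = random.Random(seed)
    for _ in range(trials):
        f = rng.uniform(-10, 10)
        g = rng.uniform(-10, 10)
        lhs = abs(f + g) ** p
        rhs = 2 ** p * (abs(f) ** p + abs(g) ** p)
        # Small relative slack guards against floating-point rounding.
        if lhs > rhs * (1 + 1e-12):
            return False
    return True


print(all(check_lp_bound(p) for p in (0.5, 1.0, 3.0)))
```

A random check of this kind is no proof, but it covers exponents both below and above 1, in line with the claim for every p > 0.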




We intend to show that introduction of the (nonnegative) norm

$$\|f\|_p = I^{1/p}(|f|^p) = \left[ \int_X |f(x)|^p \, dx \right]^{1/p} \qquad (p \geq 1) \qquad (10)$$

makes $L_p$ into a complete normed linear space. For p = 1, this fact has
already been proved in Sec. 2.9 (where it was called the Riesz-Fischer
theorem). Therefore we can now confine ourselves to the case p > 1. The
norm (10) obviously satisfies Properties a and b on p. 38:


a) $\|f\|_p > 0$ if $f \neq 0$ (almost everywhere), and $\|0\|_p = 0$.
b) $\|\alpha f\|_p = |\alpha| \, \|f\|_p$ for every $f \in L_p$ and every real number $\alpha$.


It is a bit more difficult to prove the triangle inequality: 


LEMMA 1. If $\eta = \psi(\xi)$ is an increasing continuous function such that
$\psi(0) = 0$, and if $\xi = \lambda(\eta)$ is the corresponding inverse function (itself
automatically continuous and increasing), then

$$xy \leq \int_0^x \psi(\xi) \, d\xi + \int_0^y \lambda(\eta) \, d\eta \qquad (11)$$

for arbitrary x > 0, y > 0. In particular,

$$xy \leq \frac{x^p}{p} + \frac{y^q}{q} \qquad (p > 1, \ q > 1), \qquad (12)$$

where

$$\frac{1}{p} + \frac{1}{q} = 1.$$

[FIGURE 2]

Proof. The inequality (11) is geometrically obvious from Figure 2.
To deduce (12), substitute $\psi(\xi) = \xi^{p-1}$ (p > 1) and $\lambda(\eta) = \eta^{1/(p-1)}$ into
(11), and let

$$q = \frac{1}{p - 1} + 1 = \frac{p}{p - 1}. \qquad (13)$$
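Inequality (12) can be explored numerically. The sketch below assumes nothing beyond (12) and (13); the function name `young_gap` is hypothetical. It returns the difference between the right and left sides, which should be nonnegative and should vanish when $y = x^{p-1}$, i.e., when the point (x, y) lies on the curve $\eta = \psi(\xi)$.

```python
def young_gap(x, y, p):
    """Right side minus left side of (12), with q = p / (p - 1) as in (13)."""
    q = p / (p - 1)
    return x ** p / p + y ** q / q - x * y


# The gap is nonnegative on a grid of positive arguments ...
grid_ok = all(
    young_gap(i / 4, j / 4, 3.0) >= -1e-12
    for i in range(1, 20)
    for j in range(1, 20)
)
# ... and vanishes on the curve y = x**(p - 1), here x = 2, p = 3, y = 4.
print(grid_ok, young_gap(2.0, 4.0, 3.0))
```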


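LEMMA 2 (Hölder’s inequality). If $f \in L_p$, $g \in L_q$, where

$$\frac{1}{p} + \frac{1}{q} = 1 \qquad (p > 1, \ q > 1),$$

then

$$I(|fg|) \leq \|f\|_p \, \|g\|_q. \qquad (14)$$

Proof. Applying the inequality (12) to the functions |f(x)| and
|g(x)|, we have

$$|f(x)| \, |g(x)| \leq \frac{|f(x)|^p}{p} + \frac{|g(x)|^q}{q}.$$

If

$$\|f\|_p = I^{1/p}(|f|^p) = 1, \qquad \|g\|_q = I^{1/q}(|g|^q) = 1, \qquad (15)$$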




then, integrating the pointwise inequality just obtained, we have

$$I(|fg|) \leq \frac{1}{p} + \frac{1}{q} = 1. \qquad (16)$$

More generally, if (15) is not satisfied, consider the functions

$$f_0 = \frac{f}{\|f\|_p}, \qquad g_0 = \frac{g}{\|g\|_q}$$

instead. Then $\|f_0\|_p = \|g_0\|_q = 1$, and (16) implies

$$I(|f_0 g_0|) = \frac{I(|fg|)}{\|f\|_p \, \|g\|_q} \leq 1,$$

which is equivalent to (14).
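Hölder's inequality can be tried out on a finite weighted set of points, where the integral reduces to a weighted sum. This is a sketch under that assumption only; the helper names `integral`, `lp_norm` and `holder_sides` are hypothetical.

```python
def integral(values, weights):
    """Integral over a finite weighted point set: sum of value * weight."""
    return sum(v * w for v, w in zip(values, weights))


def lp_norm(values, weights, p):
    """The norm I^{1/p}(|f|^p) on the finite model."""
    return integral([abs(v) ** p for v in values], weights) ** (1 / p)


def holder_sides(f, g, weights, p):
    """Return (I(|fg|), ||f||_p * ||g||_q) with 1/p + 1/q = 1."""
    q = p / (p - 1)
    lhs = integral([abs(a * b) for a, b in zip(f, g)], weights)
    rhs = lp_norm(f, weights, p) * lp_norm(g, weights, q)
    return lhs, rhs


weights = [1.0, 2.0, 0.5]
f = [1.0, -2.0, 3.0]
print(holder_sides(f, [0.5, 4.0, -1.0], weights, 3.0))
# Choosing |g| = |f|^(p - 1) makes the two sides agree up to rounding.
print(holder_sides(f, [abs(v) ** 2 for v in f], weights, 3.0))
```

The second call illustrates the equality case, which corresponds to equality in (12) at every point.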
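THEOREM 6. The norm (10) satisfies the triangle inequality

$$\|f + g\|_p \leq \|f\|_p + \|g\|_p \qquad (f, g \in L_p).$$

Proof. Given that $f, g \in L_p$, we have

$$|f + g|^p \leq (|f| + |g|)^p = |f| \, (|f| + |g|)^{p-1} + |g| \, (|f| + |g|)^{p-1}, \qquad (17)$$

and hence, since (13) implies $p - 1 = p/q$,

$$(|f| + |g|)^{p-1} = (|f| + |g|)^{p/q} \in L_q,$$

where

$$\|(|f| + |g|)^{p-1}\|_q = I^{1/q}[(|f| + |g|)^p]. \qquad (18)$$

Integrating (17), and using Hölder’s inequality and (18), we obtain

$$I[(|f| + |g|)^p] \leq \|f\|_p \, \|(|f| + |g|)^{p-1}\|_q + \|g\|_p \, \|(|f| + |g|)^{p-1}\|_q = (\|f\|_p + \|g\|_p) \, I^{1/q}[(|f| + |g|)^p],$$

and hence

$$I^{1/p}[(|f| + |g|)^p] \leq \|f\|_p + \|g\|_p$$

after dividing by $I^{1/q}[(|f| + |g|)^p]$ and recalling that

$$1 - \frac{1}{q} = \frac{1}{p}.$$

But then

$$\|f + g\|_p = I^{1/p}(|f + g|^p) \leq I^{1/p}[(|f| + |g|)^p] \leq \|f\|_p + \|g\|_p,$$

as required.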
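Both the identity (18) and the triangle inequality itself can be checked on a finite weighted point set. As before this is only a sketch: the integral is modelled by a weighted sum, and the helper names are hypothetical.

```python
def integral(values, weights):
    """Integral over a finite weighted point set."""
    return sum(v * w for v, w in zip(values, weights))


def lp_norm(values, weights, p):
    """The norm I^{1/p}(|f|^p) on the finite model."""
    return integral([abs(v) ** p for v in values], weights) ** (1 / p)


def identity_18(f, g, weights, p):
    """Both sides of (18): the L_q norm of (|f| + |g|)^(p - 1) versus I^{1/q}[(|f| + |g|)^p]."""
    q = p / (p - 1)
    s = [abs(a) + abs(b) for a, b in zip(f, g)]
    left = lp_norm([v ** (p - 1) for v in s], weights, q)
    right = integral([v ** p for v in s], weights) ** (1 / q)
    return left, right


def minkowski_sides(f, g, weights, p):
    """Return (||f + g||_p, ||f||_p + ||g||_p)."""
    lhs = lp_norm([a + b for a, b in zip(f, g)], weights, p)
    rhs = lp_norm(f, weights, p) + lp_norm(g, weights, p)
    return lhs, rhs


weights = [1.0, 2.0, 0.5]
f, g = [1.0, -2.0, 3.0], [0.5, 4.0, -1.0]
print(identity_18(f, g, weights, 3.0))
print(minkowski_sides(f, g, weights, 3.0))
```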

THEOREM 7. The space $L_p$ is complete.


Proof. The proof differs only slightly from that of the Riesz-Fischer
theorem (corresponding to the case p = 1). Given a Cauchy sequence
$\varphi_n \in L_p$, it is enough to show that $\varphi_n$ contains a subsequence $\varphi_{n_k}$ with




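a limit $\varphi \in L_p$, since then $\varphi$ will also be the limit of the whole sequence
$\varphi_n$. This follows from the inequality

$$\|\varphi - \varphi_n\|_p \leq \|\varphi - \varphi_{n_k}\|_p + \|\varphi_{n_k} - \varphi_n\|_p$$

and the fact that the second term on the right goes to zero as $n \to \infty$
and $n_k \to \infty$. Clearly, we can always find an increasing sequence of indices
$n_k$ such that

$$\|\varphi_n - \varphi_{n_k}\|_p < \frac{1}{2^k}$$

for $n > n_k$. In particular,

$$\|\varphi_{n_{k+1}} - \varphi_{n_k}\|_p < \frac{1}{2^k},$$

which implies that the series

$$\sum_{k=1}^{\infty} |\varphi_{n_{k+1}} - \varphi_{n_k}| \qquad (19)$$

converges almost everywhere. In fact,

$$I^{1/p}\left[ \left( \sum_{k=1}^{N} |\varphi_{n_{k+1}} - \varphi_{n_k}| \right)^p \right] = \left\| \sum_{k=1}^{N} |\varphi_{n_{k+1}} - \varphi_{n_k}| \right\|_p \leq \sum_{k=1}^{N} \|\varphi_{n_{k+1}} - \varphi_{n_k}\|_p \leq \sum_{k=1}^{N} \frac{1}{2^k} < 1,$$

and the assertion then follows by taking the limit as $N \to \infty$ and invoking
Levi's theorem. Since (19) converges almost everywhere, the same is true
of the series

$$\sum_{k=1}^{\infty} (\varphi_{n_{k+1}} - \varphi_{n_k}),$$

with partial sums

$$\sum_{k=1}^{N} (\varphi_{n_{k+1}} - \varphi_{n_k}) = \varphi_{n_{N+1}} - \varphi_{n_1}.$$

This means that the sequence $\varphi_{n_k}$ has a limit (almost everywhere) as
$k \to \infty$. Let $\varphi$ denote this limit. Then, for fixed k, the function $\varphi_{n_j} - \varphi_{n_k}$
approaches $\varphi - \varphi_{n_k}$ almost everywhere as $j \to \infty$. Since

$$I^{1/p}(|\varphi_{n_j} - \varphi_{n_k}|^p) = \|\varphi_{n_j} - \varphi_{n_k}\|_p < \frac{1}{2^k} \qquad (j > k),$$

it follows from the result of Sec. 2.8.2 that $\varphi - \varphi_{n_k}$ belongs to $L_p$, and
hence so does $\varphi$ itself. Moreover, by the same result,

$$\|\varphi - \varphi_{n_k}\|_p = I^{1/p}(|\varphi - \varphi_{n_k}|^p) \leq \frac{1}{2^k}.$$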




Therefore $\varphi_{n_k}$ converges to $\varphi$ in the norm of the space $L_p$, and the proof
is complete.


It is not hard to see that every elementary function $h(x) \in H$ belongs to
the space $L_p$. In fact, suppose $|h(x)| \leq M$ (an elementary function must
be bounded), and consider the set $E = \{x : |h(x)| > 1\}$. Then E is summable
and

$$|h(x)|^p \leq \begin{cases} M^p & \text{if } x \in E, \\ |h(x)| & \text{if } x \notin E. \end{cases}$$

In other words,

$$|h(x)|^p \leq M^p \chi_E(x) + |h(x)| \, \chi_{X - E}(x) \leq M^p \chi_E(x) + |h(x)|.$$

But both terms on the right are summable, and hence so is $|h(x)|^p$, i.e.,
$H \subset L_p$, as asserted. Actually, much more can be said:


THEOREM 8. The family of elementary functions H is dense in $L_p$.


Proof. We want to show that any function $f \in L_p$ can be approximated
arbitrarily closely in the $L_p$-norm by an elementary function.
Since if f belongs to $L_p$, so do $f^+$ and $f^-$, there is no loss of generality in
assuming that f is nonnegative. Consider the increasing sequence of
measurable sets

$$E_n = \left\{ x : \frac{1}{n} \leq f(x) \leq n \right\} \qquad (n = 1, 2, \ldots),$$

and let

$$f_n(x) = \begin{cases} f(x) & \text{if } x \in E_n, \\ 0 & \text{if } x \notin E_n. \end{cases}$$

Obviously $f_n \nearrow f$ and $(f - f_n)^p \searrow 0$, and hence, by Levi’s theorem,

$$\|f - f_n\|_p = I^{1/p}[(f - f_n)^p] \to 0.$$

Therefore, given any $\varepsilon > 0$, we can choose n such that

$$\|f - f_n\|_p < \frac{\varepsilon}{2}. \qquad (20)$$

The set $E_n$ is summable, since $\chi_n \equiv \chi_{E_n} \leq n^p f^p$. But then the function
$f_n$ is summable, since, by Hölder’s inequality,

$$I f_n = I(\chi_n f_n) \leq I^{1/q}(\chi_n) \, I^{1/p}(f_n^p).$$

Therefore, since H is dense in the space $L_1$, as shown in Sec. 2.9, there
exists a sequence of elementary functions $h_k$ converging to $f_n$ in the $L_1$-norm
(as $k \to \infty$). It can be assumed that the $h_k$ are nonnegative, since
the convergence of $h_k$ to $f_n$ is unaffected by replacing $h_k$ by $h_k^+$. Moreover,




it can be assumed that the $h_k$ are bounded by the number n, since the
convergence of $h_k$ to $f_n$ is also unaffected by replacing $h_k$ by

$$\min(h_k, n) = n \min\left( \frac{h_k}{n}, 1 \right),$$

where every function on the right is elementary by Stone’s Axiom c
on p. 119. But then $h_k \to f_n$ in the $L_p$-norm, as well as in the $L_1$-norm.
In fact,

$$\|f_n - h_k\|_p = I^{1/p}(|f_n - h_k|^p) = I^{1/p}(|f_n - h_k|^{p-1} \, |f_n - h_k|) \leq n^{(p-1)/p} \, I^{1/p}(|f_n - h_k|) \to 0 \quad \text{as } k \to \infty,$$

and hence we can choose k such that

$$\|f_n - h_k\|_p < \frac{\varepsilon}{2}. \qquad (21)$$

Combining (20) and (21), we have

$$\|f - h_k\|_p \leq \|f_n - h_k\|_p + \|f - f_n\|_p < \varepsilon,$$

and the proof is complete, since $\varepsilon$ is arbitrary.


PROBLEMS 


1. Prove that a continuous function of one or several measurable functions is 
measurable. 


Hint. A continuous function is a limit of polynomials. 


Comment. On the other hand, a measurable function of continuous 
functions need not be measurable (see Prob. 7, p. 204). 


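2. Find a sequence of measurable sets $E_1 \supset E_2 \supset \cdots$ for which

$$\mu\left( \bigcap_{n=1}^{\infty} E_n \right) \neq \lim_{n \to \infty} \mu(E_n).$$

Hint. The use of finite $\mu(E_n)$ is precluded by Theorem 4, p. 119. Consider
the sets $E_n = \{x : n \leq x < \infty\}$, n = 1, 2, ... with empty intersection and
infinite measure.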


3. Prove that the symbol > can be replaced by > in Theorem 5, p. 120. 


4 (Egorov’s theorem). Let $f_1, f_2, \ldots$ be a sequence of measurable functions
defined on a summable set X, and suppose $f_n$ converges (almost everywhere)
to a function f. Prove that given any $\varepsilon > 0$, there exists a set $E \subset X$ with
$\mu(E) > \mu(X) - \varepsilon$ on which $f_n$ converges to f uniformly.


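Hint. It can be assumed that f = 0 and $f_n \searrow 0$. Consider the sets

$$E_n^m = \left\{ x : 0 \leq f_n(x) < \frac{1}{m} \right\}.$$

Given any $\varepsilon > 0$, there exists an n = n(m) such that

$$\mu(E_{n(m)}^m) > \mu(X) - \frac{\varepsilon}{2^m}.$$

Now let

$$E = \bigcap_{m=1}^{\infty} E_{n(m)}^m.$$
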
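5. Let $f_1, f_2, \ldots$ be a sequence of measurable functions defined on a set X,
and let E be the set on which $f_n$ converges. Show that E is measurable.


Hint.

$$E = \bigcap_{k=1}^{\infty} \bigcup_{n=1}^{\infty} \bigcap_{m=n}^{\infty} \left\{ x : |f_n(x) - f_m(x)| < \frac{1}{k} \right\}.$$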


6. Let $f_1, f_2, \ldots$ be a sequence of measurable functions defined on a summable
set X, and suppose $f_n$ converges (almost everywhere) to a function f. Show that

$$\lim_{n \to \infty} \mu\{x : |f(x) - f_n(x)| > c\} = 0 \quad \text{for arbitrary } c > 0. \qquad (22)$$

Hint. The set

$$\bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} \{x : |f(x) - f_n(x)| > c\}$$

is empty.


Comment. We say that a sequence of measurable functions $f_n$ converges
in measure to f if it satisfies (22).


7. Show that a sequence of measurable functions converging in measure to a
function f always contains a subsequence converging almost everywhere to f,
although the sequence itself may not converge almost everywhere to f.


Hint. Given any integers k and m, there exists an n = n(k, m) such that

$$\mu\left\{ x : |f_n(x) - f(x)| > \frac{1}{k} \right\} < \frac{1}{2^k m}.$$

As $k \to \infty$, the sequence $f_{n(k, m)}$ converges to f on a set of measure greater than

$$\mu(X) - \frac{1}{m}.$$

Now consider the sequence $f_{n(m, m)}$.


8. Prove that if every subsequence of a given sequence of measurable functions
contains a subsequence converging almost everywhere to a given function f,
then the sequence converges in measure to f.


Hint. Assume the opposite and use Prob. 7. 




9. Prove that together $f_n(x) \geq 0$ and $I f_n \to 0$ imply $f_n \to 0$ in measure, but
not $f_n \to 0$ almost everywhere. Show that the condition $f_n(x) \geq 0$ cannot be
dropped.


Hint. Use Prob. 8. 


10. Introduce the metric

$$\rho(f, g) = I\left( \frac{|f - g|}{1 + |f - g|} \right) \qquad (23)$$

in the space $\mathfrak{M}$ of all measurable functions defined on a summable set X.
Verify that $\rho$ has all the properties of a metric. Show that convergence in the
metric (23) is equivalent to convergence in measure. Show that $\mathfrak{M}$ is complete.


11. Given the sets 


I |] 
f E=(0, 1), E, = (5: ’ =| (n = 1,2,...), 


construct a function $\varphi(x)$ summable on every $E_n$, but not on E, despite the
convergence of the series

$$\int_{E_1} \varphi(x) \, dx + \int_{E_2} \varphi(x) \, dx + \cdots.$$


Hint. Suppose that 


[,,7°@) eS [p27 pecs 
on every E,. 
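The distinction drawn in Problems 7 and 9 between convergence in measure and convergence almost everywhere is often illustrated by the "typewriter" sequence of indicator functions on [0, 1). The sketch below is not taken from the text; it assumes Lebesgue measure on [0, 1), and the names `typewriter` and `f` are hypothetical.

```python
def typewriter(n):
    """Return (a, b) such that f_n is the indicator of [a, b), for n = 1, 2, 3, ..."""
    k = n.bit_length() - 1   # block index: 2**k <= n < 2**(k + 1)
    j = n - 2 ** k           # position inside block k
    return j / 2 ** k, (j + 1) / 2 ** k


def f(n, x):
    """Value of the n-th typewriter function at x."""
    a, b = typewriter(n)
    return 1 if a <= x < b else 0


x = 0.3
# Within every block 2**k <= n < 2**(k + 1) exactly one f_n equals 1 at x,
# so f_n(x) does not converge, although the interval lengths 2**(-k) tend to 0.
print([sum(f(n, x) for n in range(2 ** k, 2 ** (k + 1))) for k in range(1, 6)])
# Along the subsequence n = 2**k the values at x are eventually 0.
print([f(2 ** k, x) for k in range(2, 8)])
```

Here the functions are nonnegative and their integrals, equal to the interval lengths, tend to zero, so the sequence converges to 0 in measure, while at every point of [0, 1) it takes the value 1 infinitely often.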


7 


CONSTRUCTIVE MEASURE THEORY 


In this chapter, we describe the approximation of measurable sets by 
sets of a simpler kind, which in the n-dimensional case are just blocks and 
their finite and countable combinations. We shall then be able to give a 
constructive definition of a measurable set and its measure. 


7.1. Semirings of Subsets 


A family $\mathfrak{A}$ of subsets $A \subset X$ is called a semiring if it has the following
two properties:


1) If $A \in \mathfrak{A}$, $B \in \mathfrak{A}$, then $AB \in \mathfrak{A}$.


2) If $A_1 \in \mathfrak{A}$, $A \in \mathfrak{A}$ and $A_1 \subset A$, then there exist sets $A_2, \ldots, A_r \in \mathfrak{A}$
such that

$$A = A_1 \cup A_2 \cup \cdots \cup A_r,$$

where the sets $A_1, A_2, \ldots, A_r$ are (pairwise) disjoint.


Example. The set of all subblocks of an n-dimensional block is a semiring.
In fact, Properties 1 and 2 have already been stated for blocks on p. 62.


Next we prove two further properties of semirings: 


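3) Let $A_1, \ldots, A_k$ be k disjoint sets in $\mathfrak{A}$, all contained in a given set
$A \in \mathfrak{A}$. Then there exist sets $B_{k+1}, \ldots, B_m \in \mathfrak{A}$ such that

$$A = A_1 \cup \cdots \cup A_k \cup B_{k+1} \cup \cdots \cup B_m, \qquad (1)$$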


where the sets $A_1, \ldots, A_k, B_{k+1}, \ldots, B_m$ are themselves disjoint. For
k = 1 this assertion is just Property 2 above. Suppose the decomposition
(1) holds for some integer k. Then, as we now show, it also holds
for k + 1, thereby leading to a proof by induction. In fact, if $A_{k+1} \subset A$
and if $A_{k+1}$ intersects none of the sets $A_1, \ldots, A_k$, then

$$A_{k+1} = A_{k+1} B_{k+1} \cup \cdots \cup A_{k+1} B_m. \qquad (2)$$

But, by the definition of a semiring, we have

$$\begin{aligned}
B_{k+1} &= A_{k+1} B_{k+1} \cup B_{k+1}^{(1)} \cup \cdots \cup B_{k+1}^{(r_{k+1})}, \\
&\;\;\vdots \\
B_m &= A_{k+1} B_m \cup B_m^{(1)} \cup \cdots \cup B_m^{(r_m)},
\end{aligned} \qquad (3)$$

where the sets $B_{k+1}^{(1)}, \ldots, B_{k+1}^{(r_{k+1})}$ are disjoint, and similarly for
$B_m^{(1)}, \ldots, B_m^{(r_m)}$. Substituting (3) into (1) and using (2), we obtain
the desired result.

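4) The union of an arbitrary finite collection $A_1, \ldots, A_m$ of sets in $\mathfrak{A}$
can be represented in the form

$$A_1 \cup \cdots \cup A_m = A_1^{(1)} \cup \cdots \cup A_1^{(k_1)} \cup \cdots \cup A_m^{(1)} \cup \cdots \cup A_m^{(k_m)}, \qquad (4)$$

where the sets on the right are all disjoint and belong to $\mathfrak{A}$, and

$$A_j^{(1)}, \ldots, A_j^{(k_j)} \subset A_j \qquad (j = 1, \ldots, m).$$

For m = 1 the assertion is obvious. Suppose the decomposition (4)
holds for some integer m. Then, as we now show, it also holds for
m + 1. In fact, by Property 3,

$$A_{m+1} = A_{m+1} A_1^{(1)} \cup \cdots \cup A_{m+1} A_m^{(k_m)} \cup A_{m+1}^{(1)} \cup \cdots \cup A_{m+1}^{(k_{m+1})}, \qquad (5)$$

where the sets on the right are disjoint and belong to $\mathfrak{A}$. But then,
combining (4) and (5), we obtain

$$A_1 \cup \cdots \cup A_m \cup A_{m+1} = A_1^{(1)} \cup \cdots \cup A_m^{(k_m)} \cup A_{m+1}^{(1)} \cup \cdots \cup A_{m+1}^{(k_{m+1})},$$

where the sets on the right are disjoint and belong to $\mathfrak{A}$, and moreover

$$A_{m+1}^{(1)}, \ldots, A_{m+1}^{(k_{m+1})} \subset A_{m+1},$$

as required. In other words, adding a term $A_{m+1}$ to the union
$A_1 \cup \cdots \cup A_m$ leads to the appearance of new terms $A_{m+1}^{(1)}, \ldots, A_{m+1}^{(k_{m+1})}$
without changing the terms originally in the decomposition.
Therefore the decomposition also holds for a countable collection
of sets $A_1, \ldots, A_m, \ldots$ This fact will be needed later.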
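The incremental character of the decomposition in Property 4 can be seen concretely for the semiring of half-open intervals on the line. The sketch below is an illustration under that assumption: an interval (a, b] is stored as the tuple `(a, b)` with a < b, and the function names `subtract` and `disjointify` are hypothetical.

```python
def subtract(interval, cut):
    """Parts of the half-open interval (a, b] that are not covered by cut = (c, d]."""
    a, b = interval
    c, d = cut
    if d <= a or c >= b:          # the two intervals do not meet
        return [interval]
    parts = []
    if a < c:
        parts.append((a, c))      # left remainder (a, c]
    if d < b:
        parts.append((d, b))      # right remainder (d, b]
    return parts


def disjointify(intervals):
    """Disjoint half-open pieces with the same union, added one interval at a time.

    Pieces produced for earlier intervals are never changed; each new
    interval only contributes the pieces of itself not yet covered.
    """
    pieces = []
    for iv in intervals:
        new = [iv]
        for p in pieces:
            new = [r for part in new for r in subtract(part, p)]
        pieces.extend(new)
    return pieces


print(disjointify([(0, 2), (1, 3), (0.5, 4)]))
```

In this example every piece is contained in the interval that produced it, and the total length of the pieces equals the length of the union.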




7.2. The Subspace Generated by a Semiring 
of Summable Subsets 


Let $\mathfrak{A}$ be a semiring of summable subsets of a space X (equipped with a
Lebesgue integral I), and let $H_{\mathfrak{A}}$ be the set of all finite linear combinations
of characteristic functions of the sets of $\mathfrak{A}$. Then every function

$$h(x) = \sum_{k=1}^{m} \alpha_k \chi_{E_k}(x) \qquad (6)$$

in $H_{\mathfrak{A}}$ has a well-defined integral

$$Ih = \sum_{k=1}^{m} \alpha_k \mu(E_k). \qquad (7)$$

This immediately suggests the following question: Can we use the Daniell
scheme to construct an integral (as in Chap. 2), starting from the set $H_{\mathfrak{A}}$
and the integral (7), and if so, what does the construction give?

According to Property 4, the sets $E_k$ figuring in (6) can always be regarded
as disjoint, so that

$$|h(x)| = \sum_{k=1}^{m} |\alpha_k| \chi_{E_k}(x)$$

again belongs to $H_{\mathfrak{A}}$. Moreover, the set $H_{\mathfrak{A}}$ is obviously closed under the
formation of linear combinations. Therefore $H_{\mathfrak{A}} = H$ satisfies Axioms a and
b for a family of elementary functions (see p. 23). Furthermore, the integral
(7) satisfies Axioms 1-3 for an elementary integral. In fact, Axioms 1 and 2
are obvious, while, to verify Axiom 3, we merely note that $h_n \searrow 0$ implies
$Ih_n \to 0$, by Levi's theorem. Therefore all the prerequisites for constructing
an integral are satisfied. Let us now see what we obtain from this construction.

Suppose the sequence $h_n \in H_{\mathfrak{A}}$ is nondecreasing and has bounded integrals
$Ih_n$. Then the function $f = \lim h_n$ is summable, by Levi's theorem. Therefore
the class $L_{\mathfrak{A}}^+$, obtained from $H_{\mathfrak{A}}$ by the construction of Sec. 2.3, is contained
in L(X). Completing the construction of the integral by taking differences
of functions in $L_{\mathfrak{A}}^+$ (as in Sec. 2.5), we arrive at a class $L_{\mathfrak{A}}$, which in turn
must be contained in L(X). Moreover, according to Sec. 2.9, the class $L_{\mathfrak{A}}$
is complete in the $I(|\varphi|)$ norm, and hence is a closed subspace of L(X). On
the other hand, as we know from the same section, the elementary functions
$h \in H_{\mathfrak{A}}$ are dense in $L_{\mathfrak{A}}$. It follows that the class $L_{\mathfrak{A}}$ is the closure of the set
$H_{\mathfrak{A}}$ in the $I(|\varphi|)$ norm.


7.3. Sufficient Semirings 


Consider the set $H_{\mathfrak{A}}$ of all finite linear combinations of characteristic
functions of all summable subsets of a given set X. Then it is easy to see that




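$H_{\mathfrak{A}}$ is dense in L(X). In fact, we need only verify that every nonnegative
function $\varphi \in L(X)$ belongs to $\bar{H}_{\mathfrak{A}}$, the closure of $H_{\mathfrak{A}}$ in the $I(|\varphi|)$ norm. Let

$$\varphi_\varepsilon = \sum_{n=1}^{\infty} n\varepsilon \, \chi_{E_n}(x) = \sum_{n=1}^{N} n\varepsilon \, \chi_{E_n}(x) + \sum_{n=N+1}^{\infty} n\varepsilon \, \chi_{E_n}(x)$$

be the same function as in Sec. 6.6. Then the first term on the right belongs
to $H_{\mathfrak{A}}$, while the second converges in norm to 0 as $N \to \infty$, since

$$\left\| \sum_{n=N+1}^{\infty} n\varepsilon \, \chi_{E_n}(x) \right\| = \sum_{n=N+1}^{\infty} n\varepsilon \, \mu(E_n) \to 0 \quad \text{as } N \to \infty.$$

Therefore $\varphi_\varepsilon$ belongs to $\bar{H}_{\mathfrak{A}}$. But then $\varphi$ also belongs to $\bar{H}_{\mathfrak{A}}$, as asserted,
since $\varphi_\varepsilon \nearrow \varphi$, which implies

$$I(\varphi - \varphi_\varepsilon) = \|\varphi - \varphi_\varepsilon\| \to 0.$$

In other words, according to the last remark of the preceding section,
$L_{\mathfrak{A}} = L(X)$.

We now ask the following question: When does $L_{\mathfrak{A}}$ coincide with L(X), or
equivalently, when is the semiring $\mathfrak{A}$ (of summable subsets of X) sufficient
in the sense that linear combinations of characteristic functions of its sets
are dense in L(X)? The answer to this question is given by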


THEOREM 1. A semiring $\mathfrak{A}$ of summable sets is sufficient if and only if,
given any summable set E and any $\varepsilon > 0$, there exists a set F, which is
the union of a finite number of sets of $\mathfrak{A}$, such that

$$\mu(E - EF) + \mu(F - EF) < \varepsilon. \qquad (8)$$

Proof. If (8) holds, then

$$I(|\chi_E - \chi_F|) = \|\chi_E - \chi_F\| < \varepsilon,$$

and hence the characteristic function of any summable set E is a limit
(in the L-norm) of linear combinations of characteristic functions of
sets of the semiring $\mathfrak{A}$. But then linear combinations of characteristic
functions of sets of $\mathfrak{A}$ are dense in L.

Conversely, suppose we know that linear combinations of characteristic
functions of sets of $\mathfrak{A}$ are dense in L. Then, if E is any summable set,

$$\chi_E(x) = \lim_{n \to \infty} g_n(x)$$

for some sequence

$$g_n(x) = \sum_{k=1}^{r_n} \alpha_{kn} \chi_{E_{kn}}(x), \qquad E_{kn} \in \mathfrak{A}.$$

It can be assumed that the sets $E_{kn}$ ($k = 1, \ldots, r_n$) are disjoint for every
fixed n. Consider the function

$$\tilde{g}_n(x) = {\sum_k}' \chi_{E_{kn}}(x),$$




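where the sum is taken only over sets $E_{kn}$ such that $\alpha_{kn} > \frac{1}{2}$. Writing

$$E_n = {\bigcup_k}' E_{kn}$$

(the union being taken over the same sets), we distinguish four possibilities:


1) $\chi_E(x) = \tilde{g}_n(x) = 1$ if $x \in EE_n$;
2) $\chi_E(x) = 1$, $\tilde{g}_n(x) = 0$, $g_n(x) \leq \frac{1}{2}$,

$|\chi_E(x) - \tilde{g}_n(x)| = 1 \leq 2 |\chi_E(x) - g_n(x)|$ if $x \in E(X - E_n)$;
3) $\chi_E(x) = 0$, $\tilde{g}_n(x) = 1$, $g_n(x) > \frac{1}{2}$,

$|\chi_E(x) - \tilde{g}_n(x)| = 1 \leq 2 |\chi_E(x) - g_n(x)|$ if $x \in (X - E)E_n$;
4) $\chi_E(x) = 0$, $\tilde{g}_n(x) = 0$ if $x \in (X - E)(X - E_n)$.


Thus

$$|\chi_E(x) - \tilde{g}_n(x)| \leq 2 |\chi_E(x) - g_n(x)|$$

for all $x \in X$, and hence

$$\|\chi_E - \tilde{g}_n\| = I(|\chi_E - \tilde{g}_n|) \leq 2 I(|\chi_E - g_n|) \to 0 \qquad (9)$$

as $n \to \infty$.

The function $\tilde{g}_n(x)$ is itself the characteristic function of some set $B_n$,
which is a finite union of sets of the semiring $\mathfrak{A}$. We now assert that the
condition (8) holds, with $F = B_n$, if n is sufficiently large. To see this,
we note that $(\chi_E - \tilde{g}_n)^+$ is the characteristic function of the set $E - EB_n$,
while $(\chi_E - \tilde{g}_n)^-$ is the characteristic function of the set $B_n - EB_n$.
But then, according to (9),

$$\mu(E - EB_n) + \mu(B_n - EB_n) = I[(\chi_E - \tilde{g}_n)^+] + I[(\chi_E - \tilde{g}_n)^-] = I(|\chi_E - \tilde{g}_n|) \to 0$$

as $n \to \infty$, and the proof is complete.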
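The key step of the converse is the pointwise estimate obtained by thresholding the approximating function at 1/2. The sketch below checks that estimate on random values; it assumes only that the characteristic function takes the values 0 and 1 and that the approximant takes arbitrary real values. The names `g_tilde`, `pointwise_bound_holds` and `random_check` are hypothetical.

```python
import random


def g_tilde(g_value):
    """Thresholded approximant: 1 where g exceeds 1/2, otherwise 0."""
    return 1 if g_value > 0.5 else 0


def pointwise_bound_holds(chi, g_value):
    """Check |chi - g_tilde| <= 2 |chi - g| at one point, with chi equal to 0 or 1."""
    return abs(chi - g_tilde(g_value)) <= 2 * abs(chi - g_value)


def random_check(trials=1000, seed=1):
    """Try the bound for both values of chi and many random values of g."""
    rng = random.Random(seed)
    return all(
        pointwise_bound_holds(chi, rng.uniform(-2, 3))
        for chi in (0, 1)
        for _ in range(trials)
    )


print(random_check(), pointwise_bound_holds(1, 0.5))
```

The boundary value 1/2 with chi = 1 is the extreme case of possibility 2 above, where the two sides of the estimate coincide.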
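Given a family of sets $\mathfrak{A}$, the family of all sets obtained by forming
countable unions of sets of $\mathfrak{A}$ will be denoted by $\mathfrak{A}_\sigma$, while the family of all
sets obtained by forming countable intersections of sets of $\mathfrak{A}$ will be denoted
by $\mathfrak{A}_\delta$. Then we write $\mathfrak{A}_{\sigma\delta} = (\mathfrak{A}_\sigma)_\delta$, $\mathfrak{A}_{\sigma\delta\sigma} = (\mathfrak{A}_{\sigma\delta})_\sigma$, and so on.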


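LEMMA 1. If $\mathfrak{A}$ is a sufficient semiring, then, given any summable set
E and any $\varepsilon > 0$, there exists a set $F \in \mathfrak{A}_\sigma$ such that

$$\mu(E - EF) = 0, \qquad \mu(F - EF) < \varepsilon. \qquad (10)$$

Proof. Given any integers m and n, we use Theorem 1 to find a set
$F_{mn}$, which is a finite union of sets of $\mathfrak{A}$, such that

$$\mu(E - EF_{mn}) < \frac{1}{2^n m}, \qquad \mu(F_{mn} - EF_{mn}) < \frac{1}{2^n m}.$$

Then the set

$$F_m = \bigcup_n F_{mn}$$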




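belongs to $\mathfrak{A}_\sigma$, and moreover

$$E - EF_m = \bigcap_n (E - EF_{mn}), \qquad F_m - EF_m = \bigcup_n (F_{mn} - EF_{mn}).$$

Therefore

$$\mu(E - EF_m) \leq \inf_n \mu(E - EF_{mn}) = 0,$$

$$\mu(F_m - EF_m) \leq \sum_{n=1}^{\infty} \mu(F_{mn} - EF_{mn}) < \sum_{n=1}^{\infty} \frac{1}{2^n m} = \frac{1}{m},$$

and (10) is satisfied by choosing F to be any $F_m$ with $m > 1/\varepsilon$.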


THEOREM 2. If $\mathfrak{A}$ is a sufficient semiring, then, given any summable
set E, there exists a set $F \in \mathfrak{A}_{\sigma\delta}$ such that

$$\mu(E - EF) = \mu(F - EF) = 0.$$

Proof. According to Lemma 1, given any integer m, there exists a
set $F_m \in \mathfrak{A}_\sigma$ such that

$$\mu(E - EF_m) = 0, \qquad \mu(F_m - EF_m) < \frac{1}{m}.$$

Then the set

$$F = \bigcap_m F_m$$

belongs to $\mathfrak{A}_{\sigma\delta}$, and moreover

$$E - EF = \bigcup_m (E - EF_m), \qquad F - EF = \bigcap_m (F_m - EF_m).$$

Therefore

$$\mu(E - EF) \leq \sum_m \mu(E - EF_m) = 0,$$

$$\mu(F - EF) \leq \inf_m \mu(F_m - EF_m) = 0,$$

and the theorem is proved.


Remark. Thus any summable set E can be approximated to within a
set of measure zero by forming countable unions and intersections of sets
of a sufficient semiring. This follows at once from Theorem 2, since we can
always write

$$E = F \cup (E - EF) - (F - EF).$$


7.4. Completely Sufficient Semirings 


A sufficient semiring $\mathfrak{A}$ is said to be completely sufficient if, given any
set Z of measure zero and any $\varepsilon > 0$, there exists a set $A \in \mathfrak{A}_\sigma$ such that
$Z \subset A$ and $\mu(A) < \varepsilon$. A sufficient semiring need not be completely sufficient,




as shown by the example where $\mathfrak{A}$ is the family of all measurable subsets
$E \subset X$ which do not contain a fixed point $x_0$ of measure zero. Both Lemma 1
and Theorem 2 can be strengthened somewhat if the semiring $\mathfrak{A}$ is completely
sufficient. In fact, it turns out that not only can any summable set E be
approximated to within a set of measure zero by a set G of the family $\mathfrak{A}_{\sigma\delta}$,
but the approximating set G can be chosen to cover E ($G \supset E$).


LEMMA 2. If $\mathfrak{A}$ is a completely sufficient semiring, then, given any
summable set E and any $\varepsilon > 0$, there exists a set $G \in \mathfrak{A}_\sigma$ such that

$$G \supset E, \qquad \mu(G) < \mu(E) + \varepsilon.$$

Proof. According to Lemma 1, given any $\varepsilon > 0$, there exists a set
$F \in \mathfrak{A}_\sigma$ such that

$$\mu(E - EF) = 0, \qquad \mu(F - EF) < \frac{\varepsilon}{2}.$$

Then

$$E = F \cup Z - B,$$

where Z = E - EF is a set of measure zero and B = F - EF is a set of
measure less than $\varepsilon/2$. In particular,

$$\mu(F) \leq \mu(E) + \mu(B) < \mu(E) + \frac{\varepsilon}{2}.$$

Since Z is a set of measure zero and $\mathfrak{A}$ is completely sufficient, Z can be
covered by a set $A \in \mathfrak{A}_\sigma$ of measure less than $\varepsilon/2$. Then the set $G = F \cup A$
obviously satisfies all the requirements of the lemma.


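THEOREM 3. If $\mathfrak{A}$ is a completely sufficient semiring, then, given any
summable set E, there exists a set $G \in \mathfrak{A}_{\sigma\delta}$ such that

$$G \supset E, \qquad \mu(G) = \mu(E).$$

Proof. According to Lemma 2, given any integer m, there exists a
set $G_m \in \mathfrak{A}_\sigma$ such that

$$G_m \supset E, \qquad \mu(G_m) < \mu(E) + \frac{1}{m}.$$

Then the set

$$G = \bigcap_m G_m$$

contains E and belongs to $\mathfrak{A}_{\sigma\delta}$, and moreover

$$\mu(E) \leq \mu(G) \leq \mu(G_m) < \mu(E) + \frac{1}{m}.$$

Letting $m \to \infty$, we find that $\mu(G) = \mu(E)$, as asserted. Alternatively,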




we can write

$$G = E \cup Z,$$

where Z is a set of measure zero.
There is an analogue of Theorem 3 suitable for measurable sets: 


THEOREM 4. If $\mathfrak{A}$ is a completely sufficient semiring, then, given any
measurable set E, there exists a set $G \in \mathfrak{A}_{\sigma\delta\sigma}$ such that

$$G \supset E, \qquad \mu(G) = \mu(E). \qquad (11)$$

Proof. Let $X_1 \subset X_2 \subset \cdots$ be an increasing sequence of summable
sets whose union equals the space X (see Remark 2, p. 121). Then every
$EX_m$ is summable, and obviously

$$E = EX_1 \cup EX_2 \cup \cdots.$$

According to Theorem 3, there exists a sequence of sets $G_m \in \mathfrak{A}_{\sigma\delta}$ such
that

$$EX_m = G_m - Z_m \qquad (m = 1, 2, \ldots),$$

where every $Z_m$ is of measure zero. But then

$$E = \bigcup_m (G_m - Z_m) = G - Z, \qquad (12)$$

where

$$G = \bigcup_m G_m$$

and

$$Z \subset \bigcup_m Z_m$$

is of measure zero. Since (12) and (11) are equivalent, the theorem is
proved.


7.5. Outer Measure and the Measurability Criterion 


In this section, sets belonging to an underlying completely sufficient
semiring $\mathfrak{A}$ of summable sets $E \subset X$ will be called simple sets. Given an
arbitrary set $E \subset X$, let

$$\mu^*(E) = \inf_{E \subset A_1 \cup A_2 \cup \cdots} \sum_{m=1}^{\infty} \mu(A_m), \qquad (13)$$

where the greatest lower bound is taken with respect to all countable
coverings of E by simple sets $A_1, A_2, \ldots$ The number $\mu^*(E)$ is called the
outer (or upper) measure of E, and the cases where $\mu^*(E) = \infty$ or $\mu^*(E)$ fails
to exist (E may have no covering of the indicated type) are not excluded.
It follows from the very definition of a completely sufficient semiring that
every set of measure zero also has outer measure zero.
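A standard illustration of a set with outer measure zero is a countable set of points on the line, covered by half-open intervals whose lengths form a geometric series. The sketch below assumes Lebesgue measure on the line with half-open intervals as simple sets, uses exact rational arithmetic, and covers only the first few points of the sequence; the function names `cover` and `total_length` are hypothetical.

```python
from fractions import Fraction


def cover(points, eps):
    """Cover the k-th point (k = 1, 2, ...) by the interval (c - eps/2**k, c]."""
    intervals = []
    for k, c in enumerate(points, start=1):
        length = Fraction(eps) / 2 ** k
        intervals.append((c - length, c))   # half-open interval containing c
    return intervals


def total_length(intervals):
    """Sum of the lengths of the covering intervals."""
    return sum(b - a for a, b in intervals)


points = [Fraction(1, k) for k in range(1, 11)]
eps = Fraction(1, 100)
print(total_length(cover(points, eps)), total_length(cover(points, eps)) < eps)
```

However many points are covered in this way, the total length stays below the chosen bound, so the greatest lower bound in (13) is zero for such a set.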




THEOREM 5. If E is summable, then $\mu^*(E)$ exists and equals $\mu(E)$.


Proof. There is no loss of generality in assuming that the sets
$A_1, A_2, \ldots$ figuring in (13) are disjoint, since otherwise, according to
Property 4, p. 135, we can successively replace every $A_m$ by disjoint
subsets $A_m^{(1)}, \ldots, A_m^{(k_m)}$, whose union equals $A_1$ if m = 1 and
$A_m - A_m(A_1 \cup \cdots \cup A_{m-1})$ if m > 1. But the sum of the measures of the disjoint sets $A_m^{(j)}$
($j \leq k_m$, m = 1, 2, ...) cannot exceed the sum of the measures of the
original sets $A_m$ (m = 1, 2, ...). Assuming that the sets $A_1, A_2, \ldots$ are
disjoint, we can write (13) in the form

$$\mu^*(E) = \inf_{E \subset A_1 \cup A_2 \cup \cdots} \mu(A_1 \cup A_2 \cup \cdots),$$

using the measurability of $A_1 \cup A_2 \cup \cdots$ and countable additivity.
But obviously $\mu(A_1 \cup A_2 \cup \cdots) \geq \mu(E)$, since $A_1 \cup A_2 \cup \cdots \supset E$.
It follows that

$$\mu^*(E) \geq \mu(E), \qquad (14)$$

provided $\mu^*(E)$ exists. On the other hand, by Lemma 2, there exists a
sequence of sets $G_m \in \mathfrak{A}_\sigma$ such that

$$G_m \supset E, \qquad \mu(G_m) < \mu(E) + \frac{1}{m} \qquad (m = 1, 2, \ldots).$$

Since $G_m$ is a countable covering of E by simple sets, $\mu^*(E)$ exists and
moreover

$$\mu^*(E) \leq \mu(G_m) < \mu(E) + \frac{1}{m}.$$

Taking the limit as $m \to \infty$, we obtain

$$\mu^*(E) \leq \mu(E), \qquad (15)$$

and the theorem now follows by comparing (14) and (15).


It is now natural to ask whether measurable sets can be defined directly 
in terms of outer measure. In the case where the space X is summable, the 
answer is given by 


THEOREM 6 (Measurability criterion). If X is summable, then the set
$E \subset X$ is measurable if and only if

$$\mu^*(E) + \mu^*(\mathcal{C}E) = \mu(X), \qquad (16)$$

where $\mathcal{C}E = X - E$.

Proof. In other words, a necessary and sufficient condition for
measurability of E is that the sum of the outer measures of E and its
complement $\mathcal{C}E$ (relative to X) be equal to the measure of the whole


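space X. The necessity of (16) is almost obvious, since if E is measurable,
so is $\mathcal{C}E$, and then (16) follows from Theorem 5 and the relation

$$\mu(E) + \mu(\mathcal{C}E) = \mu(X).$$

To prove the sufficiency, suppose E satisfies (16). Then, given any integer
m, there exist sets $G_E^{(m)}, G_{\mathcal{C}E}^{(m)} \in \mathfrak{A}_\sigma$ such that $G_E^{(m)} \supset E$, $G_{\mathcal{C}E}^{(m)} \supset \mathcal{C}E$ and

$$\mu(G_E^{(m)}) + \mu(G_{\mathcal{C}E}^{(m)}) < \mu(X) + \frac{1}{m} \qquad (m = 1, 2, \ldots). \qquad (17)$$

Let $h_E^{(m)}$ and $h_{\mathcal{C}E}^{(m)}$ be the characteristic functions of the sets $G_E^{(m)}$ and
$G_{\mathcal{C}E}^{(m)}$. Then it is easy to see that

$$0 \leq 1 - h_{\mathcal{C}E}^{(m)}(x) \leq \chi_E(x) \leq h_E^{(m)}(x),$$

where $\chi_E$ is the characteristic function of E, and hence

$$0 \leq I(1 - h_{\mathcal{C}E}^{(m)}) \leq I h_E^{(m)}.$$

On the other hand, (17) implies

$$I h_E^{(m)} - I(1 - h_{\mathcal{C}E}^{(m)}) = I h_E^{(m)} + I h_{\mathcal{C}E}^{(m)} - \mu(X) = \mu(G_E^{(m)}) + \mu(G_{\mathcal{C}E}^{(m)}) - \mu(X) < \frac{1}{m}.$$

Clearly, the sequence $h_E^{(m)}$ can be regarded as nonincreasing, and the
sequence $1 - h_{\mathcal{C}E}^{(m)}$ as nondecreasing. Therefore

$$h_E^{(m)} - (1 - h_{\mathcal{C}E}^{(m)}) \searrow f,$$

where f is nonnegative and summable with If = 0, by Corollary 1 to
Levi's theorem. It follows that f(x) = 0 almost everywhere, by Corollary
2 to Levi's theorem. But

$$h_E^{(m)}(x) - [1 - h_{\mathcal{C}E}^{(m)}(x)] \geq h_E^{(m)}(x) - \chi_E(x) \geq 0$$

and hence

$$\chi_E(x) = \lim_{m \to \infty} h_E^{(m)}(x)$$

almost everywhere, i.e., $\chi_E$ is measurable and the proof is complete.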
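The criterion (16) can be seen at work in a deliberately crude toy model, which is an assumption of this sketch and not an example from the text: a finite set X is partitioned into weighted blocks, the blocks play the role of simple sets, and the outer measure of a set is the total weight of the blocks meeting it. In this model the criterion singles out exactly the unions of blocks. All names (`blocks`, `outer_measure`, `satisfies_criterion`) are hypothetical.

```python
# Hypothetical partition of X = {1, ..., 6} into weighted blocks (the simple sets).
blocks = {"A": ({1, 2}, 1.0), "B": ({3}, 0.5), "C": ({4, 5, 6}, 2.0)}
X = {1, 2, 3, 4, 5, 6}


def outer_measure(E):
    """Total weight of the blocks meeting E: the cheapest covering of E by blocks."""
    return sum(w for pts, w in blocks.values() if pts & E)


def satisfies_criterion(E):
    """Check mu*(E) + mu*(X - E) = mu(X), the analogue of (16)."""
    total = sum(w for _, w in blocks.values())
    return outer_measure(E) + outer_measure(X - E) == total


print(satisfies_criterion({1, 2, 3}))   # a union of blocks
print(satisfies_criterion({1}))         # splits block A, so both coverings pay for A
```

A set that splits a block forces that block into the coverings of both the set and its complement, so the two outer measures add up to more than the measure of X.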


7.6. Measure Theory in n-Space. Examples 


We now examine what the above general theory gives when the set X
is an n-dimensional block $\mathbf{B}$. Let $\sigma(B)$ be a continuous nonnegative quasi-volume
defined on the semiring $\mathfrak{A}$ of all subblocks $B \subset \mathbf{B}$. As in Sec. 5.8,




we construct a space $L_\sigma(\mathbf{B})$ of $\sigma$-summable functions, equipped with a
Lebesgue-Stieltjes integral $I_\sigma$, starting from step functions as elementary
functions and the quasi-volume $\sigma(B)$ as elementary integral. Then functions
which are ($\sigma$-almost-everywhere) limits of sequences of step functions are
said to be $\sigma$-measurable, and sets with $\sigma$-measurable characteristic functions
are themselves said to be $\sigma$-measurable. Since $\mathbf{B}$ is finite, every $\sigma$-measurable
set is automatically $\sigma$-summable. Let $\chi_E$ be the characteristic function of a
$\sigma$-measurable set $E \subset \mathbf{B}$. Then the $\sigma$-measure of E is the quantity
$\sigma(E) = I_\sigma \chi_E$, which obviously reduces to the quasi-volume of E if E is a block.
According to Chap. 6, the family of $\sigma$-measurable sets $E \subset \mathbf{B}$ is closed
under the formation of complements and countable unions and intersections.
Moreover, $\sigma$-measure is countably additive, in the sense of Theorem 3,
p. 117.

We would now like to give a constructive characterization of $\sigma$-measurable
sets, using the theory developed in this chapter. First we note that the family
of step functions H is just the family of all finite linear combinations of
characteristic functions of sets of the semiring $\mathfrak{A}$. But H is dense in $L_\sigma(\mathbf{B})$,
by the Riesz-Fischer theorem. Therefore $\mathfrak{A}$ is a sufficient semiring. It follows
from Theorem 1 that given any $\sigma$-measurable set $E \subset \mathbf{B}$ and any $\varepsilon > 0$,
there is a finite union of blocks $F = B_1 \cup \cdots \cup B_m$ such that

$$\sigma(E - EF) + \sigma(F - EF) < \varepsilon,$$

i.e., to within a set of arbitrarily small $\sigma$-measure, every $\sigma$-measurable set
is a finite union of blocks. Moreover, if Z has $\sigma$-measure zero relative to
the integral $I_\sigma$ (cf. footnote 1, p. 89), then, by exactly the same argument as
on pp. 14-15, Z can be covered by a countable collection of blocks $B_1, B_2, \ldots$
whose quasi-volumes have a sum less than $\varepsilon$.¹ Therefore the semiring $\mathfrak{A}$
is also completely sufficient. This allows us to apply all the results of Secs.
7.4 and 7.5, which in the present context take the following form:


1) Every $\sigma$-measurable set is a countable intersection of countable unions
of blocks minus a set of $\sigma$-measure zero (Theorem 3).
2) The outer measure of a set $E \subset \mathbf{B}$ is the quantity

$$\sigma^*(E) = \inf_{E \subset B_1 \cup B_2 \cup \cdots} \sigma(B_1 \cup B_2 \cup \cdots),$$

and $\sigma(E) = \sigma^*(E)$ if E is $\sigma$-measurable (Theorem 5).
3) The set $E \subset \mathbf{B}$ is $\sigma$-measurable if and only if

$$\sigma^*(E) + \sigma^*(\mathbf{B} - E) = \sigma(\mathbf{B})$$

(Theorem 6).

1 Here, however, we do not require every point of Z to be an interior point of a block, i.e., 
a covering by a collection of blocks has the usual set-theoretic meaning (contrary to p. 13). 




DEFINITION. A set obtained from blocks by forming no more than a 
countable number of unions, intersections and complements is called a 
(classical) Borel set. 


Thus every Borel set is $\sigma$-measurable, and every $\sigma$-measurable set is the
difference between a Borel set and a set of $\sigma$-measure zero. (There are sets
of $\sigma$-measure zero which are not Borel sets, as shown in Prob. 2, p. 148.)
Every open set G (i.e., every set consisting entirely of interior points) is a
Borel set and hence $\sigma$-measurable. In fact, every point of G can be covered
by a block $B \subset G$ whose boundary sheets have rational coordinates. Similarly,
every set F which is closed (relative to $\mathbf{B}$) is a Borel set, being the complement
of an open set.

The three examples considered previously in Secs. 4.4 and 5.2 will now
be examined from the standpoint of $\sigma$-measure:


Example 1. If the quasi-volume $\sigma(B)$ of the block B is its ordinary volume
s(B), then the $\sigma$-measurable sets are called Lebesgue measurable (or just
measurable), and the quantity s(E) is called the Lebesgue measure of E (or
just the measure of E). All the results of Secs. 7.4 and 7.5, involving outer
measure and approximation by countable collections of blocks, are valid
for Lebesgue measure.


Example 2. Next we find the structure of the $\sigma$-measurable sets generated
by the quasi-volume

$$\sigma(B) = \int_B g(x) \, dx,$$

where the function $g(x) \geq 0$ is Lebesgue summable over the basic block $\mathbf{B}$.
Every Borel set G is $\sigma$-measurable, and in fact

$$\sigma(G) = I_\sigma \chi_G = I(g \chi_G) = \int_G g(x) \, dx, \qquad (18)$$

according to formula (1), p. 91 and Theorem 6, p. 107. Every set Z of Lebesgue
measure zero is $\sigma$-measurable, with $\sigma(Z) = 0$. To see this, let G be a Borel
set such that $G \supset Z$, $\mu(G) = 0$. Then, according to (18),

$$\sigma(G) = \int_G g(x) \, dx = 0,$$

and hence $\sigma(Z) \leq \sigma(G) = 0$. Every Lebesgue-measurable set E is $\sigma$-measurable,
being the difference between a Borel set and a set of Lebesgue measure
zero, and clearly

$$\sigma(E) = \int_E g(x) \, dx.$$

In particular, the set $G_0 = \{x : g(x) = 0\}$ is $\sigma$-measurable, and

$$\sigma(G_0) = \int_{G_0} g(x) \, dx = 0.$$




Moreover, every subset $Q \subset G_0$ also has $\sigma$-measure zero (although it may
not be Lebesgue measurable!). Therefore the union of a Lebesgue-measurable
set and a set $Q \subset G_0$ is $\sigma$-measurable. In fact, the converse is true, i.e.,
every $\sigma$-measurable set $E \subset \mathbf{B}$ is the union of a Lebesgue-measurable set
and a set $Q \subset G_0$. To see this, suppose E is $\sigma$-measurable. Then its
characteristic function $\chi_E$ is $\sigma$-summable. Hence, as shown on p. 91, the function
$g\chi_E$ is Lebesgue summable, and

$$\sigma(E) = I_\sigma \chi_E = I(g \chi_E) = \int g(x) \chi_E(x) \, dx.$$

The set $G_+ = \{x : g(x) > 0\}$ is Lebesgue measurable, and obviously

$$E = EG_+ \cup EG_0. \qquad (19)$$

Since the set $EG_+ = \{x : \chi_E(x) g(x) > 0\}$ is also Lebesgue measurable, (19)
represents E as the union of a Lebesgue-measurable set and a set on which
g(x) vanishes, as required.

On p. 91 it was shown that if $\varphi$ is $\sigma$-summable, then the product $\varphi g$ is
Lebesgue summable. We are now in a position to prove the converse (as
promised), i.e., if the product $\varphi g$ is Lebesgue summable, then $\varphi$ is $\sigma$-summable.
First we show that $\varphi$ is $\sigma$-measurable. Given any C, let $E_C(\varphi)$ be the set of
points where the inequality $\varphi(x) \leq C$ holds. This set coincides with the set A
where the inequality $\varphi(x) g(x) \leq C g(x)$ holds, except possibly for a set
$A_0 \subset A$ on which g(x) vanishes. The set A is Lebesgue measurable and hence
$\sigma$-measurable, while $A_0$ has $\sigma$-measure zero. Therefore $E_C(\varphi)$ is $\sigma$-measurable,
and hence $\varphi$ is $\sigma$-measurable, since C is arbitrary. To prove that $\varphi$ is
$\sigma$-summable, we need only show that the integrals $I_\sigma \varphi_n$, where
$\varphi_n(x) = \min\{|\varphi(x)|, n\}$, form a bounded sequence. The function $\varphi_n$ is $\sigma$-measurable
(since $\varphi$ is) and bounded. Therefore $\varphi_n$ is $\sigma$-summable, and moreover

$$I_\sigma \varphi_n = I(\varphi_n g) \leq I(|\varphi| g),$$

as asserted. Thus the class $L_\sigma$ of $\sigma$-summable functions has now been completely
characterized: A function $\varphi$ belongs to $L_\sigma$ if and only if the product
$\varphi g$ is summable in the ordinary sense.


Example 3. Consider the $\sigma$-measure generated by the quasi-volume

$$\sigma(B) = \sum_{c_m \in B} g_m,$$

where $c_1, \ldots, c_m, \ldots$ is a sequence of points in the basic block $\mathbf{B}$, and
$g_1, \ldots, g_m, \ldots$ is a corresponding sequence of real numbers such that

$$\sum_{m=1}^{\infty} g_m < \infty \qquad (g_m > 0)$$


(cf. Example 3, p. 91). Then every set $E \subset \mathbf{B}$ is $\sigma$-measurable, with
$\sigma$-measure given by the formula

$$\sigma(E) = \sum_{c_m \in E} g_m.$$

In fact, $E$ differs from $E_0 \subset E$, the set of points $c_m$ contained in $E$, only by a
set of $\sigma$-measure zero. But $E_0$ is $\sigma$-measurable, since only countably many
points $c_m$ lie in $E_0$. Therefore $E$ itself is $\sigma$-measurable, and moreover

$$\sigma(E) = \sigma\Bigl(\bigcup_{c_m \in E} \{c_m\}\Bigr) = \sum_{c_m \in E} g_m,$$

as asserted.²
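The formula of Example 3 can be tried out numerically. The following is only a sketch under assumptions that are not in the text: the atoms are taken to be $c_m = 1/m$ in the block $[0, 1]$, the weights are $g_m = 2^{-m}$, and the sequence is truncated at a hypothetical cutoff `M` so that every sum is finite and exact.

```python
from fractions import Fraction

# Assumed data (not from the text): atoms c_m = 1/m with weights g_m = 2^(-m),
# truncated at m = M so that all sums are finite.
M = 20
atoms = [(Fraction(1, m), Fraction(1, 2**m)) for m in range(1, M + 1)]

def sigma(indicator):
    """sigma(E) = sum of g_m over the atoms c_m lying in E (E given by its indicator)."""
    return sum((g for c, g in atoms if indicator(c)), Fraction(0))

def half_open(alpha, beta):
    """Indicator of the half-open interval (alpha, beta]."""
    return lambda x: alpha < x <= beta

whole = sigma(lambda x: True)
left = sigma(half_open(Fraction(0), Fraction(1, 4)))
right = sigma(half_open(Fraction(1, 4), Fraction(1)))
no_atoms = sigma(lambda x: False)   # a set containing none of the points c_m
```

Any set that misses every atom gets $\sigma$-measure zero, however large it is in the Lebesgue sense, which is the point of the example.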


7.7. Lebesgue Measure for n = 1. Inner Measure


Finally we consider in more detail the simplest case $n = 1$, where $\mathbf{B} = [a, b]$
and $\mu$ is Lebesgue measure. As is well known, every open set $G \subset \mathbf{B}$
is the union of a countable number of disjoint open intervals $G_j$ (the
components of $G$). Therefore $G$ is measurable, with measure equal to the sum
of the lengths of the intervals $G_j$. But the half-open intervals $(\alpha, \beta] \subset \mathbf{B}$
form a completely sufficient semiring of summable subsets of $\mathbf{B}$, and obviously
$\mu(\alpha, \beta] = \beta - \alpha$. Therefore we can define the outer measure of an arbitrary
set $E \subset \mathbf{B}$ as the quantity

$$\mu^*(E) = \inf_{G \supset E} \mu(G),$$

where the greatest lower bound is taken with respect to all open sets $G$
containing $E$. Then, according to Theorem 6, a set $E \subset \mathbf{B}$ is measurable if
and only if

$$\mu^*(E) + \mu^*(\mathcal{C}E) = \mu(\mathbf{B}) = b - a. \tag{20}$$

The sets satisfying (20) are precisely those called measurable by Lebesgue,
and used as the starting point of his theory of measure and integration.

There is another, equivalent definition of measurable sets $E \subset \mathbf{B}$, which
is worth mentioning at this point. First we introduce the concept of the
inner (or lower) measure of a set $E$, defined as the quantity

$$\mu_*(E) = \sup_{F \subset E} \mu(F),$$

where the least upper bound is taken with respect to all closed sets $F$
contained in $E$, and $\mu(F)$ is the measure of $F$.³


² By $\{c_m\}$ we mean the set whose only element is $c_m$.
³ As already noted on p. 145, every closed set $F$ is measurable, being the complement
of an open set.


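THEOREM 7. The set $E \subset \mathbf{B} = [a, b]$ is measurable if and only if

$$\mu_*(E) = \mu^*(E). \tag{21}$$

Proof. Obviously, we have

$$\mu_*(E) = \sup_{F \subset E} \mu(F) = \sup_{F \subset E} [b - a - \mu(\mathcal{C}F)]
= b - a - \inf_{F \subset E} \mu(\mathcal{C}F) = b - a - \inf_{\mathcal{C}E \subset \mathcal{C}F} \mu(\mathcal{C}F).$$

But $F$ is closed and contained in $E$, and hence $\mathcal{C}F$ is open (relative to $\mathbf{B}$)
and contains $\mathcal{C}E$. Since $\mathcal{C}F$ is an arbitrary open set containing $\mathcal{C}E$, it
follows that

$$\mu_*(E) + \mu^*(\mathcal{C}E) = b - a.$$

Therefore (20) and (21) are equivalent, as asserted.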
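As a sanity check of (20) and (21), consider the very special case where $E$ is a finite union of intervals in $[a, b]$, so that the outer measure of $E$ and of its complement are simply total lengths. The sketch below makes that assumption (it is not the general definition, which involves an infimum over all open covers); the helper names `merge`, `length` and `complement` are hypothetical.

```python
from fractions import Fraction

def merge(intervals):
    """Merge overlapping or touching intervals (lo, hi) into disjoint components."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged

def length(intervals):
    """Total length of a finite union of intervals."""
    return sum((hi - lo for lo, hi in merge(intervals)), Fraction(0))

def complement(intervals, a, b):
    """Components of [a, b] not covered by the given intervals (all inside [a, b])."""
    gaps, cursor = [], a
    for lo, hi in merge(intervals):
        if lo > cursor:
            gaps.append((cursor, lo))
        cursor = hi
    if cursor < b:
        gaps.append((cursor, b))
    return gaps

a, b = Fraction(0), Fraction(1)
E = [(Fraction(1, 10), Fraction(3, 10)), (Fraction(1, 5), Fraction(1, 2)),
     (Fraction(7, 10), Fraction(4, 5))]

outer_E = length(E)                      # mu*(E)
outer_CE = length(complement(E, a, b))   # mu*(CE)
inner_E = (b - a) - outer_CE             # mu_*(E), via the identity in the proof
```

Here `outer_E + outer_CE` equals `b - a`, which is (20), and `inner_E` equals `outer_E`, which is (21). For nonmeasurable sets no such finite computation is available.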


PROBLEMS 


1. Show that the family of all Borel sets in the interval (0, 1] has the power 
of the continuum. 


Hint. A Borel set can be specified by using a sequence of closed sets.⁴


2. Show that the family of all Lebesgue-measurable sets in the interval [0, 1], 
even just sets of measure zero, has a power greater than the power of the 
continuum. 


Hint. Any subset of a set of measure zero is also of measure zero. Use 
Prob. 2, p. 21. 


Comment. Problems 1 and 2 show that there exist measurable sets which 
are not Borel sets. 


3. Given a family $\mathcal{A}$ of measurable sets, no two of which differ by only
a set of measure zero, show that the power of $\mathcal{A}$ is no greater than the power
of the continuum.


Hint. Every measurable set is a Borel set to within a set of measure zero. 


4. Construct a measurable set E which has positive measure both on every 
interval and on the complement of every interval. 


Hint. Construct the analogue of the Cantor set but with positive measure,
and then repeat the construction in every interval adjacent to $E$ (see Probs. 2
and 3, pp. 21-22).


⁴ For further details, see F. Hausdorff, Mengenlehre, third edition, Walter de Gruyter &
Co., Berlin (1935), p. 181.


Comment. The function $\chi_E(x)$ is not Riemann integrable, and cannot be
made Riemann integrable by modification on any set of measure zero.


5. Prove the following result due to N. N. Luzin: Given any $\varepsilon > 0$ and any
measurable function $f(x)$ defined on a finite block $B$, there exists a closed
set $F \subset B$, with measure $\mu(F) > \mu(B) - \varepsilon$, on which $f(x)$ is continuous.


Hint. For step functions, the proof is immediate. To treat the general 
case, use Egorov’s theorem (Prob. 3, p. 131). 


6. A measurable function $g(y)$ defined on a set $Y$ with a measure $\nu$ is said to be
equimeasurable with a measurable function $f(x)$ defined on a set $X$ with a measure
$\mu$ if $\nu\{y: g(y) < C\} = \mu\{x: f(x) < C\}$ for arbitrary $C$. Prove that given any
measurable function $f(x)$ on an abstract set $X$ with measure $\mu$, there exists a
nondecreasing equimeasurable function $g(y)$ on the interval $[0, 1]$ with ordinary
Lebesgue measure.


Hint. If $F(\alpha) = \mu\{x: f(x) < \alpha\}$, then

$$g(y) = \inf_{F(\alpha) > y} \alpha.$$

7. A function $f(x)$ defined on a block $B$ is called a Borel function if every set
$E(f; c) = \{x: f(x) < c\}$ is a Borel set. Show that every measurable function
$f(x)$ on the block $B$ can be made into a Borel function by suitable modification
on a set of measure zero.


Hint. Each of the countably many sets

$$E\Bigl(f; \frac{k+1}{n}\Bigr) - E\Bigl(f; \frac{k}{n}\Bigr)$$

becomes a Borel set after deleting some set of measure zero. The union of all
the deleted sets is a set of measure zero, on which we can set $f(x) = 0$, say.


8 


AXIOMATIC MEASURE THEORY 




We now consider another way of constructing a theory of the integral, 
whose starting point is a family of subsets of an arbitrary set X equipped 
with a countably additive measure. The new approach is both simple and 
straightforward, if we use our prior knowledge of the Daniell scheme.


8.1. Elementary, Borel and Lebesgue Measures 


A family $\mathfrak{A}$ of subsets of a set $X$ is called a ring if it has the following
two properties:


1) If $A \in \mathfrak{A}$, $B \in \mathfrak{A}$, then $A \cup B \in \mathfrak{A}$, $AB \in \mathfrak{A}$.
2) If $A \in \mathfrak{A}$, $B \in \mathfrak{A}$ and $B \subset A$, then $A - B \in \mathfrak{A}$.


The sets belonging to a ring $\mathfrak{A}$ are called elementary sets. The set $X$ itself
may or may not belong to $\mathfrak{A}$. In the latter case, it is assumed that $X$ is the
union of a countable number of elementary sets.
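For a finite family of finite sets, the two ring properties can be checked by brute force. The sketch below is only an illustration under that finiteness assumption; the function names `is_ring` and `generated_ring` are hypothetical, and sets are represented as `frozenset` objects.

```python
def is_ring(family):
    """Check properties 1) and 2) for a finite family of frozensets."""
    fam = set(family)
    for A in fam:
        for B in fam:
            # Property 1: unions and intersections stay in the family.
            if A | B not in fam or A & B not in fam:
                return False
            # Property 2: differences A - B with B contained in A stay in the family.
            if B <= A and A - B not in fam:
                return False
    return True

def generated_ring(generators):
    """Close a finite family under union, intersection and difference."""
    fam = set(generators) | {frozenset()}
    changed = True
    while changed:
        changed = False
        for A in list(fam):
            for B in list(fam):
                for C in (A | B, A & B, A - B):
                    if C not in fam:
                        fam.add(C)
                        changed = True
    return fam

gens = [frozenset({1, 2}), frozenset({2, 3})]
ring = generated_ring(gens)
```

With $X = \{1, 2, 3, 4\}$, the ring generated by $\{1, 2\}$ and $\{2, 3\}$ consists of all subsets of $\{1, 2, 3\}$ and does not contain $X$ itself, illustrating that $X$ "may or may not belong" to the ring.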

By a countably additive measure we mean a finite nonnegative additive
set function $\mu(A)$, which is defined on a ring $\mathfrak{A}$ and has the following property:


a) If $A_1, \ldots, A_n, \ldots$ is any sequence of disjoint sets belonging to $\mathfrak{A}$
whose union

$$A = \bigcup_{n=1}^{\infty} A_n$$

also belongs to $\mathfrak{A}$, then

$$\mu(A) = \mu(A_1) + \cdots + \mu(A_n) + \cdots.$$

For brevity, such a measure is also called an elementary measure.



Two further properties of an elementary measure are easily deduced 
from Property a: 


b) If $A_1 \subset A_2 \subset \cdots$ is any increasing sequence of sets belonging to $\mathfrak{A}$,
whose union

$$A = \bigcup_{n=1}^{\infty} A_n$$

also belongs to $\mathfrak{A}$, then

$$\mu(A) = \lim_{n \to \infty} \mu(A_n).$$

In fact,

$$A = A_1 \cup (A_2 - A_1) \cup \cdots,$$

where the sets $A_1, A_2 - A_1, \ldots$ are disjoint, and hence, according to
Property a,

$$\mu(A) = \mu(A_1) + \mu(A_2 - A_1) + \cdots = \lim_{n \to \infty} \mu(A_n).$$

c) If $A_1 \supset A_2 \supset \cdots$ is any decreasing sequence of sets belonging to $\mathfrak{A}$,
whose intersection

$$A = \bigcap_{n=1}^{\infty} A_n$$

also belongs to $\mathfrak{A}$, then

$$\mu(A) = \lim_{n \to \infty} \mu(A_n).$$

This follows from the preceding property by taking complements
(relative to $A_1$).


A ring $\mathfrak{A}$ is called a $\sigma$-ring if given any sequence $A_1, \ldots, A_n, \ldots$ of
disjoint sets belonging to $\mathfrak{A}$ and contained in a fixed (but arbitrary) set
$A_0 \in \mathfrak{A}$, their union

$$A = \bigcup_{n=1}^{\infty} A_n$$

also belongs to $\mathfrak{A}$. The restriction to disjoint sets can be dropped immediately.
In fact, if the sets $A_n$ are allowed to intersect, we can write

$$A = A_1 \cup (A_2 - A_1A_2) \cup (A_3 - A_1A_3 - A_2A_3) \cup \cdots,$$

where the sets on the right are now disjoint, and moreover belong to $\mathfrak{A}$ and
are contained in $A_0$. Similarly, the intersection of any sequence of sets
$A_1, \ldots, A_n, \ldots$ belonging to $\mathfrak{A}$ also belongs to $\mathfrak{A}$. To see this, we merely
observe that

$$A_1 - \bigcap_{n=1}^{\infty} A_1A_n = \bigcup_{n=1}^{\infty} (A_1 - A_1A_n)$$

belongs to $\mathfrak{A}$, and hence so does

$$\bigcap_{n=1}^{\infty} A_n = A_1 - \Bigl(A_1 - \bigcap_{n=1}^{\infty} A_1A_n\Bigr).$$

The sets belonging to a $\sigma$-ring are called (abstract) Borel sets.

A countably additive finite measure $\mu$ defined on a $\sigma$-ring is called a
Borel measure, and relative to the measure $\mu$, the sets $A \in \mathfrak{A}$ are called
measurable (more exactly, $\mu$-measurable). A $\sigma$-ring $\mathfrak{A}$ with a Borel measure $\mu$
is called a $\sigma_\mu$-ring if every subset of a set $Z \in \mathfrak{A}$ of measure zero also
belongs to $\mathfrak{A}$ (and hence has measure zero itself). In this case, the measure
$\mu$ is called a finitely-Lebesgue measure.

A ring $\mathfrak{A}$ of subsets of the set $X$ is called a $\Sigma$-ring if given any sequence
$A_1, \ldots, A_n, \ldots$ of disjoint sets belonging to $\mathfrak{A}$, their union also belongs to $\mathfrak{A}$.
Just as before, it is easy to see that the sets need not be disjoint, and that
the intersection

$$\bigcap_{n=1}^{\infty} A_n$$

belongs to $\mathfrak{A}$. It is always assumed that the set $X$ itself is an element of the
$\Sigma$-ring. The sets belonging to a $\Sigma$-ring are called generalized Borel sets.

A nonnegative additive set function $\mu(A)$, defined on a $\Sigma$-ring $\mathfrak{A}$ and
taking $\infty$ as a possible value, is called a generalized Borel measure if

$$\mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) = \mu(A_1) + \cdots + \mu(A_n) + \cdots,$$

where $A_1, \ldots, A_n, \ldots$ is any sequence of disjoint sets belonging to $\mathfrak{A}$ (here
the left and right-hand sides are allowed to take the value $\infty$). Relative to
the measure $\mu$, the sets $A \in \mathfrak{A}$ are called measurable (more exactly,
$\mu$-measurable), and the sets such that $\mu(A) < \infty$ are called summable
($\mu$-summable). If $\mu(X) = \infty$, it is assumed that $X = X_1 \cup X_2 \cup \cdots$, where
$X_1 \subset X_2 \subset \cdots$ and every $X_n$ ($\in \mathfrak{A}$) is of finite measure. A $\Sigma$-ring $\mathfrak{A}$ with a
generalized Borel measure $\mu$ is called a $\Sigma_\mu$-ring if every subset of a set
$Z \in \mathfrak{A}$ of measure zero also belongs to $\mathfrak{A}$. In this case, the measure $\mu$ is
called a Lebesgue measure.


Example 1. As in Chap. 2, let $X$ be the domain of a space of elementary
functions, equipped with an elementary integral, and use the Daniell scheme
to construct an integral. Then the "Daniell-measurable" sets (see Chap. 7)
form a $\Sigma_\mu$-ring, on which $\mu$ is a Lebesgue measure, while the "Daniell-summable"
sets form a $\sigma_\mu$-ring, on which $\mu$ is a finitely-Lebesgue measure.


Example 2. The bounded classical Borel sets (see Sec. 7.8) on the real
line $-\infty < x < \infty$ form a $\sigma$-ring, on which ordinary Lebesgue measure
(or a Lebesgue-Stieltjes measure) is a Borel measure (but not a finitely-Lebesgue
measure). The family of all Borel sets on the real line forms a
$\Sigma$-ring, on which ordinary Lebesgue measure is a generalized Borel measure.


Example 3. The half-open intervals $(\alpha, \beta]$ and their finite combinations
form a ring (but not a $\sigma$-ring), on which ordinary Lebesgue measure (or a
Lebesgue-Stieltjes measure) is an elementary measure.


8.2. Lebesgue and Borel Extensions of an Elementary Measure 


We begin by proving 


THEOREM 1. Let $\mu$ be an elementary measure defined on a ring $\mathfrak{A}$ of
subsets of a set $X$, and let $H$ be the family of finite linear combinations

$$h(x) = \sum_{k=1}^{m} \alpha_k \chi_{E_k}(x) \tag{1}$$

of characteristic functions of the elementary sets. Then $H$ has all the properties
of a family of elementary functions, and the integral

$$Ih = \int_X h(x)\,\mu(dx) = \sum_{k=1}^{m} \alpha_k \mu(E_k) \tag{2}$$

has all the properties of an elementary integral.


Proof. Obviously $H$ is a linear space, and hence satisfies Axiom a,
p. 23. Moreover, the sets $E_k$ figuring in (1) can always be chosen to be
disjoint, and then

$$|h(x)| = \sum_{k=1}^{m} |\alpha_k| \chi_{E_k}(x)$$

also belongs to $H$, which proves Axiom b, p. 23, while

$$\min(h, 1) = \sum_{k=1}^{m} \min(\alpha_k, 1) \chi_{E_k}(x) \in H,$$

which verifies Stone's Axiom c, p. 119. As for Stone's Axiom d, if
$\mu(X) < \infty$, then $h_n(x) \equiv 1$ is a sequence of nonnegative functions such
that

$$h_n \ge 0, \qquad \sup_n h_n(x) > 0 \quad (x \in X), \tag{3}$$

while if $\mu(X) = \infty$, then, by hypothesis, $X$ is the union of an increasing
sequence of elementary sets $X_n$, and the sequence of nonnegative functions
$h_n(x) = \chi_{X_n}(x)$ again satisfies (3). Thus $H$ has all the properties of
a space of elementary functions.

Next we turn to the proposed elementary integral (2). The fact that
$I$ is linear and nonnegative, i.e., that $I$ satisfies Axioms 1 and 2, p. 24,
is immediately apparent. As for Axiom 3, we must show that if $h_n(x) \searrow 0$
for every $x$, then $Ih_n \to 0$. First we note that if the integral of an elementary
function over an elementary subset $A \subset X$ is defined by the natural
formula $I_A h = I(\chi_A h)$ (as in Sec. 6.7), then $|I_A h| \le M\mu(A)$ if $|h(x)| \le M$
and $I_{A \cup B} h = I_A h + I_B h$ if $A$ and $B$ are disjoint. Now let $E \subset X$ be the
set where $h_1(x) > 0$ [there is no need to consider the set where $h_1(x) = 0$],
and let

$$A_n^{(m)} = E \cap \{x: h_n(x) < 1/m\}.$$

Since

$$h_n(x) = \sum_{k} \alpha_k^{(n)} \chi_{E_k^{(n)}}(x),$$

where the $E_k^{(n)}$ can be regarded as disjoint (for fixed $n$), the set $A_n^{(m)}$ is a
finite union of certain $E_k^{(n)}$ (in fact, those for which $\alpha_k^{(n)} < 1/m$), and
hence is an elementary set. For fixed $m$, the sets $A_n^{(m)}$ increase as $n$ increases,
and clearly

$$E = \bigcup_{n=1}^{\infty} A_n^{(m)} \qquad (m = 1, 2, \ldots).$$

Therefore

$$\mu(E) = \lim_{n \to \infty} \mu(A_n^{(m)}),$$

by Property b, p. 151. In other words, we can find an integer $n = n(m)$
such that

$$\mu(A_n^{(m)}) > \mu(E) - \frac{1}{m}$$

and hence

$$\mu(E - A_n^{(m)}) < \frac{1}{m}.$$

It follows that

$$Ih_n = I_{A_n^{(m)}} h_n + I_{E - A_n^{(m)}} h_n \le \frac{1}{m}\mu(E) + \frac{M}{m}$$

if $n \ge n(m)$, where $M = \max h_1(x)$. But then $Ih_n \to 0$, since $m$ can be
made arbitrarily large. Thus, finally, (2) has all the properties of an
elementary integral, and the theorem is proved.
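The objects in Theorem 1 are easy to model when $X$ is finite. The following sketch assumes a hypothetical elementary measure given by point masses on $X = \{0, 1, 2, 3\}$ (so every subset is elementary); the class name `Simple` and its methods are illustrative only. It represents $h = \sum_k \alpha_k \chi_{E_k}$ with disjoint $E_k$, evaluates formula (2), and forms $|h|$ and $\min(h, 1)$ coefficientwise as in the proof.

```python
from fractions import Fraction

# Assumed point masses (not from the text).
weights = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6), 3: Fraction(1)}

def mu(E):
    """Elementary measure of a finite set E: the sum of its point masses."""
    return sum((weights[x] for x in E), Fraction(0))

class Simple:
    """h = sum_k alpha_k * chi_{E_k}, with the sets E_k pairwise disjoint."""
    def __init__(self, terms):
        self.terms = [(a, frozenset(E)) for a, E in terms]

    def __call__(self, x):
        # Value of h at x; h vanishes outside the union of the E_k.
        return sum((a for a, E in self.terms if x in E), Fraction(0))

    def integral(self):
        # Formula (2): Ih = sum_k alpha_k * mu(E_k).
        return sum((a * mu(E) for a, E in self.terms), Fraction(0))

    def abs(self):
        # |h| has coefficients |alpha_k| on the same disjoint sets.
        return Simple([(abs(a), E) for a, E in self.terms])

    def cap(self):
        # min(h, 1) has coefficients min(alpha_k, 1) on the same disjoint sets.
        return Simple([(min(a, Fraction(1)), E) for a, E in self.terms])

h = Simple([(Fraction(3), {0, 1}), (Fraction(-2), {2})])
```

Disjointness of the $E_k$ is what makes the coefficientwise formulas for $|h|$ and $\min(h, 1)$ valid; with overlapping sets one would first have to refine them into disjoint pieces.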


Next we show that every elementary measure can be extended to a
Lebesgue measure:


THEOREM 2. Let $\mathfrak{A}$ be a ring, equipped with an elementary measure $\mu$.
Then there is a $\Sigma_{\bar\mu}$-ring $\bar{\mathfrak{A}}$, equipped with a Lebesgue measure $\bar\mu$, such that
$\bar{\mathfrak{A}} \supset \mathfrak{A}$ and $\bar\mu(A) = \mu(A)$ for every $A \in \mathfrak{A}$.


Proof. Let $H$ denote the set of linear combinations (1) of characteristic
functions of sets of $\mathfrak{A}$, and use formula (2) to define an elementary
integral on $H$. Then Theorem 1 allows us to apply all the results of Chaps.
2 and 6, thereby constructing a space $L(X)$ of summable functions and
a $\Sigma$-ring $\bar{\mathfrak{A}}$ of measurable sets equipped with a measure $\bar\mu$, where $\bar\mu$ is
a Lebesgue measure (as already noted in Example 1, p. 152). Obviously,
$\bar{\mathfrak{A}} \supset \mathfrak{A}$ and $\bar\mu(A) = I\chi_A = \mu(A)$ for every $A \in \mathfrak{A}$, and hence the measure
$\bar\mu$ can serve as the desired Lebesgue extension of the elementary
measure $\mu$.


Remark. It is clear from the very construction of $L(X)$ that the original
ring $\mathfrak{A}$ is sufficient (see Sec. 7.3), i.e., linear combinations of characteristic
functions of sets of $\mathfrak{A}$ are dense in $L(X)$. The ring $\mathfrak{A}$ is also completely
sufficient, as defined in Sec. 7.4. In fact, let $Z$ be a set of $\bar\mu$-measure zero.
Then, given any $\varepsilon > 0$, there exists a nondecreasing sequence of nonnegative
functions $h_n \in H$ such that $Ih_n < \varepsilon$ and $\sup h_n(x) \ge 1$ on $Z$. Suppose that

$$h_n(x) = \sum_{k} \alpha_{kn} \chi_{E_{kn}}(x),$$

where the $\mu$-summable sets $E_{kn}$ are disjoint (for fixed $n$). Let $G_n$ be the union
of the sets $E_{kn}$ with coefficients $\alpha_{kn} > \frac{1}{2}$. Then $\mu(G_n) < 2\varepsilon$, since $Ih_n < \varepsilon$.
Moreover

$$G = \bigcup_{n=1}^{\infty} G_n \supset Z,$$

since $\sup h_n(x) \ge 1$ on $Z$. But the sequence $h_n$ is nondecreasing, and hence
$G_1 \subset G_2 \subset \cdots$. Therefore $\bar\mu(G) = \lim \mu(G_n) \le 2\varepsilon$, as required (note that
$G \in \bar{\mathfrak{A}}$).


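THEOREM 3. Let $\mathfrak{A}$ be a $\Sigma_\mu$-ring of subsets of $X$, equipped with a
Lebesgue measure $\mu$ such that $\mu(X) < \infty$, and let $\bar{\mathfrak{A}}$ and $\bar\mu$ be constructed
as in Theorem 2. Then $\bar{\mathfrak{A}} = \mathfrak{A}$ and $\bar\mu(A) = \mu(A)$ for every $A \in \mathfrak{A}$.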


Proof. Roughly speaking, the construction of Theorem 2 leads to
nothing new if the original ring is a $\Sigma_\mu$-ring and the elementary measure
is a Lebesgue measure. First let $Z$ be a set of $\bar\mu$-measure zero. As already
noted, the elementary sets form a completely sufficient ring. Therefore,
given any $m = 1, 2, \ldots$, the set $Z$ can be covered by a finite or countable
union $E_m$ of elementary sets $E_{mk}$ ($k = 1, 2, \ldots$) such that $\mu(E_m) < 1/m$.
Clearly $E_m \in \mathfrak{A}$, and moreover $Z \subset \bigcap E_m = E$, where $E \in \mathfrak{A}$, $\mu(E) = 0$.
But then $Z \in \mathfrak{A}$, $\mu(Z) = 0$, since $\mu$ is a Lebesgue measure.

Next let $A$ be an arbitrary $\bar\mu$-summable set. Then, according to
Theorem 3, p. 140, $A$ can be represented in the form $A = G - Z$,
where $G$ is obtained from elementary sets by countable unions and
intersections and $\bar\mu(Z) = 0$. Therefore $G \in \mathfrak{A}$, since $\mathfrak{A}$ is a $\Sigma$-ring, and
moreover $Z \in \mathfrak{A}$, as just proved. It follows that $A = G - Z \in \mathfrak{A}$, and
hence $\bar{\mathfrak{A}} = \mathfrak{A}$, as asserted, where obviously $\bar\mu(A) = \mu(A)$.


Remark. Suppose we start from a $\sigma_\mu$-ring $\mathfrak{A}$ of subsets of $X$, where
$X \notin \mathfrak{A}$, equipped with a finitely-Lebesgue measure $\mu$. Then in general $\bar{\mathfrak{A}}$ is
larger than $\mathfrak{A}$ not only because it contains sets of infinite $\bar\mu$-measure, but also
because it contains sets of finite $\bar\mu$-measure not contained in $\mathfrak{A}$ itself. For
example, let $\mathfrak{A}$ be the $\sigma_\mu$-ring of bounded Lebesgue-measurable sets on the
real line $-\infty < x < \infty$, equipped with ordinary Lebesgue measure. Then
$\bar{\mathfrak{A}}$ consists of all Lebesgue-measurable sets on $-\infty < x < \infty$, including
unbounded summable sets.


In general, a given ring $\mathfrak{A}$ equipped with an elementary measure $\mu$ can
be extended in many ways to a $\Sigma$-ring equipped with a Lebesgue measure $\tilde\mu$
such that $\tilde\mu(A) = \mu(A)$ if $A \in \mathfrak{A}$. Every such measure $\tilde\mu$ will be called a
Lebesgue extension of the elementary measure $\mu$. Two different Lebesgue
extensions of $\mu$ always lead to another, as shown by


THEOREM 4. Given an elementary measure $\mu$, defined on a ring $\mathfrak{A}$
of subsets of a set $X$, let $\mu_1$ and $\mu_2$ be two Lebesgue extensions of $\mu$, defined
on $\Sigma$-rings $\mathfrak{A}_1$ and $\mathfrak{A}_2$, respectively. Let $\mathfrak{B}$ be the family of all subsets
$A \subset X$ on which both $\mu_1$ and $\mu_2$ are defined and $\mu_1(A) = \mu_2(A)$. Then
$\mathfrak{B}$ is a $\Sigma_\nu$-ring, and the set function

$$\nu(A) = \mu_1(A) = \mu_2(A),$$

called the intersection of the measures $\mu_1$ and $\mu_2$, is a Lebesgue measure
on $\mathfrak{B}$ (in fact, a Lebesgue extension of $\mu$).


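Proof. If $A, B \in \mathfrak{B}$ and if $A \subset B$, then $B - A \in \mathfrak{B}$, since $B - A \in \mathfrak{A}_1$,
$B - A \in \mathfrak{A}_2$ and

$$\mu_1(B - A) = \mu_1(B) - \mu_1(A) = \mu_2(B) - \mu_2(A) = \mu_2(B - A).$$

Moreover, if $A_1, \ldots, A_n, \ldots$ is a sequence of disjoint sets of $\mathfrak{B}$, with
union

$$A = \bigcup_{n=1}^{\infty} A_n,$$

then $A$ also belongs to $\mathfrak{B}$, since $A \in \mathfrak{A}_1$, $A \in \mathfrak{A}_2$ and

$$\mu_1(A) = \sum_{n=1}^{\infty} \mu_1(A_n) = \sum_{n=1}^{\infty} \mu_2(A_n) = \mu_2(A).$$

Finally, if $E \subset E_0 \in \mathfrak{B}$ and if $\mu_1(E_0) = \mu_2(E_0) = 0$, then $E$ belongs to
both of the $\Sigma$-rings $\mathfrak{A}_1$ and $\mathfrak{A}_2$, in each of which it has Lebesgue measure
zero, i.e., $E$ belongs to $\mathfrak{B}$ and $\nu(E) = 0$. It follows that $\mathfrak{B}$ is a $\Sigma_\nu$-ring
(containing $\mathfrak{A}$) and that $\nu$ is a Lebesgue measure on $\mathfrak{B}$. The fact that
$\nu(A) = \mu(A)$ if $A \in \mathfrak{A}$ is immediately apparent.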


Similarly, we can show that the intersection of an arbitrary number of
Lebesgue extensions of an elementary measure $\mu$ (defined in the obvious
way) is itself a Lebesgue extension of $\mu$. In particular, let $\mu^*$ be the
intersection of all Lebesgue extensions of the measure $\mu$. Then clearly $\mu^*$ is the
smallest Lebesgue extension of $\mu$, in the sense that the intersection of $\mu^*$
with any other Lebesgue extension of $\mu$ is again $\mu^*$. In fact, as we now show,
$\mu^*$ has already been constructed in the proof of Theorem 2:


THEOREM 5. The Lebesgue extensions $\bar\mu$ and $\mu^*$ of the elementary
measure $\mu$ coincide.


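Proof. We need only show that the intersection $\mu_1$ of any Lebesgue
extension of $\mu$ with the Lebesgue extension $\bar\mu$ coincides with $\bar\mu$. Let $\mathfrak{A}$
be the ring on which $\mu$ is defined, and let $\mathfrak{A}_1$, $\bar{\mathfrak{A}}$ be the $\Sigma$-rings on which
$\mu_1$, $\bar\mu$ are defined. Then it is enough to prove that $\mathfrak{A}_1 = \bar{\mathfrak{A}}$. Obviously
$\mathfrak{A} \subset \mathfrak{A}_1 \subset \bar{\mathfrak{A}}$, and hence, by the considerations of Sec. 7.2,
$\bar{\mathfrak{A}} \subset \bar{\mathfrak{A}}_1 \subset \bar{\bar{\mathfrak{A}}}$
if we extend the measures $\mu$, $\mu_1$ and $\bar\mu$ themselves. But according to
Theorem 3, $\bar{\mathfrak{A}}_1 = \mathfrak{A}_1$, $\bar{\bar{\mathfrak{A}}} = \bar{\mathfrak{A}}$, since $\mathfrak{A}_1$ and $\bar{\mathfrak{A}}$ are already
$\Sigma_\mu$-rings. It follows that $\bar{\mathfrak{A}} \subset \mathfrak{A}_1 \subset \bar{\mathfrak{A}}$, i.e., $\mathfrak{A}_1 = \bar{\mathfrak{A}}$, and the
theorem is proved.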


Remark. It can be shown that given any $\bar\mu$-nonmeasurable set $Y \subset X$,
there is always a Lebesgue extension of the measure $\mu$ in which $Y$ is
measurable (see Prob. 2, p. 178).


Next we study Borel extensions of the elementary measure $\mu$. Instead of
the $\Sigma$-ring $\bar{\mathfrak{A}}$ of $\bar\mu$-measurable sets constructed in Theorem 2, consider the
$\sigma$-ring of $\bar\mu$-summable sets, which we continue to denote by the symbol $\bar{\mathfrak{A}}$.
Then $\bar\mu$ is a Borel extension of the elementary measure $\mu$, in the sense that $\bar{\mathfrak{A}}$
is a $\sigma$-ring (in fact, a $\sigma_{\bar\mu}$-ring) containing $\mathfrak{A}$ and $\bar\mu(A) = \mu(A)$ if $A \in \mathfrak{A}$.¹
However, in general $\bar\mu$ is not the "smallest Borel extension" of $\mu$, which is
constructed as follows: Let $\mathfrak{A}^*$ be the intersection of all $\sigma$-rings containing
$\mathfrak{A}$, equipped with the Borel measure $\mu^*$ defined by the formula
$\mu^*(A) = \bar\mu(A)$ if $A \in \mathfrak{A}^*$ (note that $\mathfrak{A}^* \subset \bar{\mathfrak{A}}$). Clearly, $\mathfrak{A}^*$ is the smallest
$\sigma$-ring containing $\mathfrak{A}$, a fact which suggests calling $\mu^*$ the smallest Borel
extension of $\mu$. To justify this definition, we must prove that it is consistent
with our previous definition of the smallest (Lebesgue) extension of $\mu$:


THEOREM 6. The intersection $\mu_1$ of any Borel extension of $\mu$ with the
Borel extension $\mu^*$ coincides with $\mu^*$.


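Proof. First we note that Theorem 4 obviously remains true if we
change the word "Lebesgue" to "Borel" and the symbol $\Sigma_\nu$ to $\sigma$. Let
$\mathfrak{A}$ be the ring on which $\mu$ is defined, and let $\mathfrak{A}_1$, $\mathfrak{A}^*$ be the $\sigma$-rings on which
$\mu_1$, $\mu^*$ are defined. Then it is enough to prove that $\mathfrak{A}_1 = \mathfrak{A}^*$. But
$\mathfrak{A}_1 \subset \mathfrak{A}^*$, since $\mu_1$ is an intersection with the measure $\mu^*$, while on the other
hand $\mathfrak{A}^* \subset \mathfrak{A}_1$, since $\mathfrak{A}_1$ is a $\sigma$-ring containing $\mathfrak{A}$ and $\mathfrak{A}^*$ is the smallest
such $\sigma$-ring. Therefore $\mathfrak{A}_1 = \mathfrak{A}^*$, as required.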


¹ In fact, $\bar\mu$ is a "finitely-Lebesgue extension" of $\mu$, in an obvious sense.


Remark. As already noted, $\bar\mu$ and $\mu^*$ do not coincide in general (unlike
the case of Lebesgue extensions). In fact, $\bar{\mathfrak{A}}$ is obtained from $\mathfrak{A}^*$ by forming
all possible unions of sets $A \in \mathfrak{A}^*$ with subsets of sets $Z \in \mathfrak{A}^*$ of measure
zero. To see this, we note that on the one hand, the smallest finitely-Lebesgue
extension of $\mu$ must contain all the sets so obtained, while on the other
hand, the construction clearly leads to a finitely-Lebesgue extension of $\mu$.


8.3. Construction of the Integral from a Lebesgue Measure 


Let $\mathfrak{A}$ be a $\Sigma_\mu$-ring of subsets of a given set $X$, equipped with a Lebesgue
measure $\mu$. Then to construct a theory of Lebesgue integration on $X$, we
proceed as follows: A function $\varphi(x)$, defined on $X$, is said to be $\mu$-measurable
if the set

$$E(\varphi; c) = \{x: \varphi(x) > c\}$$

is $\mu$-measurable for arbitrary real $c$. If $\varphi(x)$ is $\mu$-measurable, the set

$$\{x: c < \varphi(x) \le d\} = E(\varphi; c) - E(\varphi; d)$$

is also $\mu$-measurable for arbitrary $c$ and $d$ ($c < d$). The Lebesgue integral
of a nonnegative measurable function $\varphi(x)$ is defined by the formula

$$I\varphi = \lim_{\varepsilon \to 0} \sum_{n=1}^{\infty} n\varepsilon\,\mu\{x: n\varepsilon < \varphi(x) \le (n+1)\varepsilon\}, \tag{4}$$

provided the limit exists, and the function $\varphi(x)$ is then said to be $\mu$-summable.
A $\mu$-measurable function $\varphi(x)$ of variable sign is said to be $\mu$-summable if
its positive and negative parts $\varphi^+(x)$ and $\varphi^-(x)$ are $\mu$-summable, and the
Lebesgue integral of $\varphi$ is then defined as the difference $I\varphi^+ - I\varphi^-$.
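The sums in formula (4) can be computed directly when $X$ is finite. The sketch below assumes a hypothetical three-point measure space (the dictionaries `mass` and `phi` are illustrative data, not from the text) and evaluates the sum for $\varepsilon = 2^{-k}$; each point $x$ with $\varphi(x) > 0$ falls in exactly one layer $n\varepsilon < \varphi(x) \le (n+1)\varepsilon$.

```python
from fractions import Fraction
import math

# Assumed finite measure space: three points with masses mu({x}).
mass = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
# A nonnegative function on it.
phi = {"a": Fraction(1), "b": Fraction(5, 2), "c": Fraction(0)}

def lebesgue_sum(phi, mass, eps):
    """Sum over n of n*eps * mu{x : n*eps < phi(x) <= (n+1)*eps}."""
    total = Fraction(0)
    for x, m in mass.items():
        if phi[x] > 0:
            # The unique n >= 0 with n*eps < phi(x) <= (n+1)*eps.
            n = math.ceil(phi[x] / eps) - 1
            total += n * eps * m
    return total

# For a finite space the integral is just the weighted sum of the values.
exact = sum((phi[x] * mass[x] for x in mass), Fraction(0))
epsilons = [Fraction(1, 2**k) for k in range(6)]
sums = [lebesgue_sum(phi, mass, eps) for eps in epsilons]
```

Each sum underestimates the weighted sum `exact` by at most $\varepsilon\,\mu(X)$, so the values in `sums` increase toward `exact` as $\varepsilon \to 0$ in this example.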

Next we show that the integral $I$ defined by (4) has all the customary
properties of the integral, as contained, say, in the theorems of Chap. 2,
Secs. 6-9. It would be tedious to verify this directly, but there is no need to
do so. In fact, starting from the Lebesgue measure $\mu$, we construct the family
$H$ of linear combinations of characteristic functions of $\mu$-summable sets,
equipped with the integral (2) of Theorem 1, afterwards using this theorem
and the Daniell scheme to extend $I$ to a space $L$. Then, just as in Chap. 6,
we consider the resulting "Daniell-measurable" sets and functions. According
to Theorem 3, p. 155, the "Daniell-measurable" sets are just the original
$\mu$-measurable sets, while, according to Theorem 5, p. 120, the
"Daniell-measurable" functions are just the $\mu$-measurable functions defined above.
Moreover, according to Sec. 6.6, the "Daniell-summable" functions are just
the $\mu$-summable functions defined above, and the "Lebesgue-Daniell"
integral of a given $\mu$-summable function $\varphi$ has the same value as the Lebesgue
integral (4). Thus all the properties of the integral established in Chap. 2
remain valid for the Lebesgue integral with the present definition.


8.4. Signed Borel Measures 


So far, the measure $\mu$ defined on a ring $\mathfrak{A}$ of subsets of a set $X$ has been
assumed to be nonnegative. We now consider "signed measures," i.e.,
measures which can take values of either sign, eventually proving that every
such measure is the difference between two nonnegative measures (see
Theorem 8 below).

Let $\mu(E)$ be a finite additive set function defined on a $\sigma$-ring $\mathfrak{A}$, and
suppose that $\mu(E)$ can take values of either sign. Suppose further that $\mu(E)$
is countably additive in the sense that

$$\mu(E) = \sum_{n=1}^{\infty} \mu(E_n)$$

if $E = E_1 \cup \cdots \cup E_n \cup \cdots$, where $E_1, \ldots, E_n, \ldots$ is any sequence of
disjoint sets of $\mathfrak{A}$, all contained in some set $A_0 \in \mathfrak{A}$. Then $\mu(E)$ is called a
(signed) Borel measure. Obviously, $\mu(E)$ still satisfies Properties b and c, p. 151.


THEOREM 7. Let $\mathfrak{A}$ be a $\sigma$-ring equipped with a signed Borel measure $\mu$.
Then the set function

$$\lambda(E) = \sup_{A \subset E} \mu(A), \tag{5}$$

where the least upper bound is taken with respect to all $\mu$-measurable
subsets of $E$, defines a nonnegative Borel measure on $\mathfrak{A}$.


Proof. After first observing that $\lambda(E) \ge 0$ and $\lambda(E) \ge \mu(E)$ [since
$A$ can always be chosen to be either the empty set or the set $E$ itself] and
that the possibility $\lambda(E) = +\infty$ is not excluded a priori, we proceed to
establish the proof in three steps:


Step 1. $\lambda(E)$ is subadditive, i.e., if $E_1, E_2, \ldots$ is a sequence of disjoint
sets of $\mathfrak{A}$ (all contained in some set $A_0 \in \mathfrak{A}$), then²

$$\lambda(E_1 \cup E_2 \cup \cdots) \le \lambda(E_1) + \lambda(E_2) + \cdots. \tag{6}$$

In fact, let $A$ be any $\mu$-measurable set contained in $E_1 \cup E_2 \cup \cdots$. Then
$A = AE_1 \cup AE_2 \cup \cdots$ represents $A$ as a union of disjoint sets. Therefore,
since $\mu$ is countably additive,

$$\mu(A) = \mu(AE_1) + \mu(AE_2) + \cdots \le \lambda(E_1) + \lambda(E_2) + \cdots,$$

and taking the least upper bound of the left-hand side with respect to $A$,
we obtain (6).


² Clearly, $E_1 \cup E_2 \cup \cdots$ is $\mu$-measurable, since $\mathfrak{A}$ is a $\sigma$-ring.


Step 2. $\lambda(E)$ is finite. Suppose, to the contrary, that $\lambda(E) = \infty$ for
some $\mu$-measurable set $E$. Then, by induction, we can construct a
sequence of $\mu$-measurable sets

$$E_0 \supset E_1 \supset \cdots \supset E_m \supset \cdots \tag{7}$$

such that

$$\lambda(E_m) = \infty, \qquad |\mu(E_m)| \ge m. \tag{8}$$

First set $E_0 = E$, a choice which obviously satisfies (8) for $m = 0$. Then
suppose sets $E_0 \supset E_1 \supset \cdots \supset E_{m-1}$ satisfying (8) have already been
constructed. Since $\lambda(E_{m-1}) = \infty$, there is a $\mu$-measurable set $A_m \subset E_{m-1}$
such that

$$\mu(A_m) \ge m + |\mu(E_{m-1})|.$$

If $\lambda(A_m) = \infty$, we can set $E_m = A_m$, thereby completing the induction.
However, if $\lambda(A_m)$ is finite, $\lambda(E_{m-1} - A_m)$ must be infinite, since otherwise
(6) would be contradicted, and moreover

$$|\mu(E_{m-1} - A_m)| \ge \mu(A_m) - |\mu(E_{m-1})| \ge m.$$

Thus, in this case, we complete the induction by choosing $E_m = E_{m-1} - A_m$.
In any event, once having constructed a sequence of sets (7) satisfying
(8), we note that the numerical sequence $\mu(E_m)$ must have a limit [equal
to $\mu(\bigcap E_m)$], since the measure $\mu$ is countably additive. But this
contradicts (8), and hence $\lambda(E)$ is finite, as asserted.


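Step 3. $\lambda(E)$ is countably additive. Let $E_1, E_2, \ldots$ be a sequence of
disjoint sets of $\mathfrak{A}$, all contained in some set $A_0 \in \mathfrak{A}$. Given any $\varepsilon > 0$
and any integer $m = 1, 2, \ldots$, let $A_m \subset E_m$ be a set such that

$$\lambda(E_m) \le \mu(A_m) + \frac{\varepsilon}{2^m}.$$

Such a set $A_m$ exists, since, as just shown, $\lambda(E_m)$ is finite. Then

$$\lambda(E_1) + \lambda(E_2) + \cdots \le \mu(A_1) + \mu(A_2) + \cdots + \varepsilon
= \mu(A_1 \cup A_2 \cup \cdots) + \varepsilon \le \lambda(E_1 \cup E_2 \cup \cdots) + \varepsilon,$$

and making $\varepsilon$ approach zero, we obtain

$$\lambda(E_1) + \lambda(E_2) + \cdots \le \lambda(E_1 \cup E_2 \cup \cdots). \tag{9}$$

Together, the inequalities (9) and (6) imply that $\lambda(E)$ is countably
additive. This completes the proof.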


THEOREM 8. A signed Borel measure $\mu(E)$ can be represented as the
difference between two nonnegative Borel measures.


Proof. Let

$$\nu(E) = \lambda(E) - \mu(E),$$

where $\lambda(E)$ is the nonnegative Borel measure (5). Since $\lambda(E)$ and $\mu(E)$
are countably additive, so is $\nu(E)$. Moreover, $\nu(E)$ is nonnegative, since
$\lambda(E) \ge \mu(E)$. Therefore $\nu(E)$ is a nonnegative Borel measure, and

$$\mu(E) = \lambda(E) - \nu(E) \tag{10}$$

is a representation of the desired type.


Remark. The representation (10) is not unique. In fact, let $\tau(E)$ be any
nonnegative Borel measure defined on $\mathfrak{A}$. Then, besides (10), we can write

$$\mu(E) = [\lambda(E) + \tau(E)] - [\nu(E) + \tau(E)] = \lambda_1(E) - \nu_1(E). \tag{11}$$

Moreover, by an obvious modification of the argument given in Sec. 4.7.2,
(11) is the most general representation of $\mu(E)$ as a difference between two
nonnegative measures, and the measures $\lambda(E)$ and $\nu(E)$ are the smallest
possible among all that can figure in (11) [hence (10) is called the canonical
representation of $\mu$, as on p. 83]. In particular, we have the formula

$$\nu(E) = \sup_{A \subset E} [-\mu(A)] \tag{12}$$

(where $A$ is $\mu$-measurable), since $\nu$ plays the same role in the representation
$-\mu = \nu - \lambda$ as $\lambda$ plays in the representation $\mu = \lambda - \nu$.
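Formulas (5), (10) and (12) can be checked exhaustively on a finite set, where every subset is measurable and the least upper bounds are maxima. The sketch below assumes hypothetical signed point masses on $X = \{0, 1, 2, 3\}$ (the dictionary `charge` is illustrative data); the functions `lam` and `nu` stand for $\lambda$ and $\nu$.

```python
from fractions import Fraction
from itertools import chain, combinations

# Assumed signed point masses (not from the text).
charge = {0: Fraction(2), 1: Fraction(-3), 2: Fraction(1, 2), 3: Fraction(-1, 4)}
X = frozenset(charge)

def mu(E):
    """Signed measure of a finite set: the sum of its point charges."""
    return sum((charge[x] for x in E), Fraction(0))

def subsets(E):
    """All subsets of E, as tuples of points."""
    pts = sorted(E)
    return chain.from_iterable(combinations(pts, r) for r in range(len(pts) + 1))

def lam(E):
    """Positive variation, formula (5): sup of mu(A) over subsets A of E."""
    return max(mu(A) for A in subsets(E))

def nu(E):
    """Negative variation, formula (12): sup of -mu(A) over subsets A of E."""
    return max(-mu(A) for A in subsets(E))

# Canonical representation (10) holds on every subset of X.
canonical_ok = all(lam(E) - nu(E) == mu(E) for E in subsets(X))
```

In this finite setting $\lambda(E)$ is just the sum of the positive charges in $E$ and $\nu(E)$ the sum of the absolute values of the negative ones, which makes the additivity proved in Step 3 of Theorem 7 visible directly.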


The nonnegative measures $\lambda$, $\nu$ and $\rho = \lambda + \nu$ are called the positive
variation, the negative variation and the total variation of $\mu$, respectively
(cf. Sec. 4.7.3). Let $\bar\lambda$, $\bar\nu$ and $\bar\rho$ denote the Lebesgue extensions of $\lambda$, $\nu$ and $\rho$,
constructed as in Sec. 8.2. Then every $\bar\rho$-summable set $E$ is also $\bar\lambda$-summable
and $\bar\nu$-summable, by substantially the same argument as given in Sec. 5.3.
This allows us to extend the measure $\mu$ itself onto the family of all $\bar\rho$-summable
sets, by writing

$$\bar\mu(E) = \bar\lambda(E) - \bar\nu(E).$$

In the terminology of Secs. 8.1 and 8.2, $\bar\mu$ is a finitely-Lebesgue extension
of the Borel measure $\mu$. In general, the need to avoid indeterminacies of
the form $\infty - \infty$ prevents $\bar\mu$ from being a Lebesgue extension.

Next we define integration with respect to the signed measure $\mu$ (omitting
the overbar for simplicity). A function $\varphi(x)$ is said to be $\mu$-measurable if
it is $\rho$-measurable, and $\mu$-summable if it is $\rho$-summable. According to
formula (4), p. 158, the $\rho$-integral of a $\rho$-summable nonnegative function
$\varphi(x)$ is given by

$$I_\rho\varphi = \lim_{\varepsilon \to 0} \sum_{n=1}^{\infty} n\varepsilon\,\rho\{x: n\varepsilon < \varphi(x) \le (n+1)\varepsilon\}.$$

The $\mu$-integral of the same function $\varphi$ is defined in the natural way:

$$I_\mu\varphi = \lim_{\varepsilon \to 0} \sum_{n=1}^{\infty} n\varepsilon\,\mu\{x: n\varepsilon < \varphi(x) \le (n+1)\varepsilon\} = I_\lambda\varphi - I_\nu\varphi.$$

Then, for a function of variable sign, we write

$$I_\mu\varphi = I_\mu\varphi^+ - I_\mu\varphi^-,$$

as usual. It is obvious that the integrals $I_\rho$ and $I_\mu$ have all the usual properties
of the integral (apart from the fact that $\mu$ is signed).


8.5. Quasi-Volumes and Measure Theory 


Let $\sigma$ be a nonnegative quasi-volume defined on a dense set $Q$ of
sub-blocks $B$ of a finite basic block $\mathbf{B}$ in Euclidean $n$-space. Since the set function
$\sigma$ is additive on the semiring $Q$, it is natural to ask whether $\sigma$ can be extended
to a (countably additive) Borel measure on some $\sigma$-ring.³ The theorems of
Sec. 8.2 are not immediately applicable, since $\sigma$ is originally defined on a
semiring instead of a ring and is in general not countably additive. However,
as shown in Sec. 5.8, if the quasi-volume $\sigma$ is continuous, there exists a space
$L_\sigma$ of $\sigma$-summable functions containing the characteristic functions of all
blocks $B \subset \mathbf{B}$ such that $I_\sigma\chi_B = \sigma(B)$. Let $\mathfrak{A}$ be the $\sigma$-ring of all $\sigma$-summable
subsets $E \subset \mathbf{B}$, equipped with the measure $\mu(E) = I_\sigma\chi_E$. Then $\mu$ is the
desired extension of the quasi-volume $\sigma$. Note that $\mathfrak{A}$ contains all the classical
Borel sets in $\mathbf{B}$, and is actually a $\sigma_\mu$-ring. The restriction to nonnegative
quasi-volumes can be easily removed. In fact, a continuous quasi-volume
of bounded variation can always be represented as the difference between
two nonnegative quasi-volumes $p$ and $q$, which are themselves continuous
(see p. 100). Thus every continuous quasi-volume $\sigma$ of bounded variation
can be extended to a signed Borel measure $\mu$.

Conversely, let $\mu(E)$ be a signed Borel measure defined on the $\sigma$-ring
of Borel subsets of an $n$-dimensional basic block $\mathbf{B}$. Then, considered
only on the blocks $B \subset \mathbf{B}$, the measure $\mu$ is clearly a
continuous quasi-volume of bounded variation (see Theorem 2, p. 101), which,
just as in Sec. 4.7.3, can be decomposed into positive and negative variations

$$p(B) = \sup \sum_{j=1}^{m} \mu(B_j), \qquad q(B) = \sup\Bigl\{-\sum_{j=1}^{m} \mu(B_j)\Bigr\}, \tag{13}$$

where the least upper bounds are taken with respect to all sets of disjoint
subblocks $B_j \subset B$ ($j = 1, \ldots, m$). On the other hand, according to
formulas (5) and (12), $\mu$ gives rise to a pair of nonnegative Borel measures


$$\lambda(E) = \sup_{A \subset E} \mu(A), \qquad \nu(E) = \sup_{A \subset E}\,[-\mu(A)], \tag{14}$$

where the least upper bounds are taken with respect to all $\mu$-measurable
subsets of $E$. To complete the correspondence between signed measures and
quasi-volumes, we must still prove the consistency of the formulas (13)
and (14):

³ The symbol $\sigma$ is used here in two different senses, but the context
precludes any possibility of confusion. The present discussion is closely
related to that at the beginning of Sec. 7.6.


THEOREM 9. The relations
$$\lambda(B) = p(B), \qquad \nu(B) = q(B)$$
hold for every block $B \subset \mathbf{B}$.

Proof. Clearly
$$\lambda(B) = \sup_{A \subset B} \mu(A) \ge p(B), \tag{15}$$
since the least upper bound is taken with respect to a larger family of sets
than in (13). On the other hand, as shown on p. 144, given any $\varepsilon > 0$,
there is a finite union of blocks $F = B_1 \cup \cdots \cup B_m$ such that
$$v(E - EF) + v(F - EF) < \varepsilon,$$
where $v(E) = p(E) + q(E)$ and $E$ is any $v$-measurable set. But then
$$|\mu(E - EF)| + |\mu(F - EF)| < \varepsilon,$$
since $|\mu| \le v$. Clearly, there is no loss of generality in assuming that the
sets $B_j$ are disjoint and contained in $B$. It follows that
$$\lambda(B) \le \sup \sum_{j} \mu(B_j) + \varepsilon = p(B) + \varepsilon,$$
which implies
$$\lambda(B) \le p(B), \tag{16}$$
since $\varepsilon$ is arbitrary. Comparing (15) and (16), we find that
$\lambda(B) = p(B)$, and hence $\nu(B) = q(B)$ also, as required.
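The equality of Theorem 9 can be checked by brute force in a toy discrete setting. The sketch below is only an illustration under stated assumptions: the interval $(0, 1]$ is cut into six cells carrying hypothetical signed weights, a "block" is a run of consecutive cells, and the names `weights`, `positive_variation` and `lambda_measure` are invented here rather than taken from the text.

```python
from itertools import combinations

# Hypothetical signed weights of 6 consecutive cells (k/6, (k+1)/6] of (0, 1].
weights = [2, -1, 3, -4, 1, -2]

def mu_block(i, j):
    """mu of the block made of cells i, ..., j-1."""
    return sum(weights[i:j])

def positive_variation(i, j):
    """p(B) as in (13): sup of sums of mu over disjoint subblocks of cells i..j-1."""
    # best[k] = best total obtainable from disjoint subblocks inside cells i..k-1
    best = {i: 0}
    for k in range(i + 1, j + 1):
        # either cell k-1 is unused, or the last subblock is cells a..k-1
        best[k] = max([best[k - 1]] +
                      [best[a] + mu_block(a, k) for a in range(i, k)])
    return best[j]

def lambda_measure(i, j):
    """lambda(B) as in (14): sup of mu(A) over all subsets A of cells i..j-1."""
    cells = range(i, j)
    return max(sum(weights[c] for c in A)
               for r in range(len(cells) + 1)
               for A in combinations(cells, r))

print(positive_variation(0, 6), lambda_measure(0, 6))
```

In this finite model both quantities reduce to the sum of the positive weights inside the block, which is the discrete counterpart of $\lambda(B) = p(B)$.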


8.6. The Hahn Decomposition 


Besides the representation of a signed Borel measure $\mu$ as the difference
between two nonnegative measures, there is another representation of $\mu$
involving the set $X$ itself:


THEOREM 10 (Hahn decomposition). Let $\mathfrak{A}$ be a $\sigma$-ring of subsets
of $X$, equipped with a signed Borel measure $\mu$. Then $X$ is the union of two
disjoint generalized Borel sets $X^+$ and $X^-$ such that⁴

$$\mu(E) \ge 0 \ \text{ if } E \subset X^+, \qquad \mu(E) \le 0 \ \text{ if } E \subset X^- \tag{17}$$

for every Borel set $E$.


⁴ Generalized Borel sets in a $\sigma$-ring $\mathfrak{A}$ are defined in the
obvious way, i.e., as countable unions of sets $E_1, E_2, \ldots \in \mathfrak{A}$
not known to be contained in a fixed set $E_0 \in \mathfrak{A}$.




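Proof. Assuming first that $X \in \mathfrak{A}$, for every $n = 1, 2, \ldots$,
we find a set $E_n \subset X$ such that

$$\mu(E_n) > \lambda(X) - \frac{1}{2^n},$$

where $\lambda$ is the positive variation of $\mu$. We then write

$$X^+ = \bigcup_{m=1}^{\infty} \bigcap_{n=m}^{\infty} E_n, \qquad X^- = X - X^+ = \bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} (X - E_n),$$

i.e., a point belongs to $X^+$ if it belongs to all the sets $E_n$ starting from
a certain value of $n$, and $X^-$ is the complement of $X^+$. Since

$$\lambda(E_n) \ge \mu(E_n) > \lambda(X) - \frac{1}{2^n},$$
$$\lambda(X - E_n) = \lambda(X) - \lambda(E_n) < \frac{1}{2^n},$$
$$\nu(E_n) = \lambda(E_n) - \mu(E_n) \le \lambda(X) - \mu(E_n) < \frac{1}{2^n},$$

we have

$$\lambda(X^-) \le \lambda\Bigl(\bigcup_{n=m}^{\infty} (X - E_n)\Bigr) \le \sum_{n=m}^{\infty} \lambda(X - E_n) < \frac{1}{2^{m-1}},$$

and hence $\lambda(X^-) = 0$. On the other hand, for any $m$ and $n \ge m$,

$$\nu\Bigl(\bigcap_{k=m}^{\infty} E_k\Bigr) \le \nu(E_n),$$

and hence

$$\nu\Bigl(\bigcap_{n=m}^{\infty} E_n\Bigr) = 0, \qquad \nu(X^+) \le \sum_{m=1}^{\infty} \nu\Bigl(\bigcap_{n=m}^{\infty} E_n\Bigr) = 0.$$

Therefore
$$\lambda(X^-) = 0, \qquad \nu(X^+) = 0,$$
which is equivalent to (17). The representation $X = X^+ \cup X^-$ is called
the Hahn decomposition (of $X$).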

Now suppose that $X \notin \mathfrak{A}$. In this case, $X$ is the union of a
sequence of sets $X_n$ with finite measures $\mu(X_n)$, where the $X_n$ can
clearly be regarded as disjoint. For every $n$, let $X_n^+$ and $X_n^-$ be the
sets figuring in the Hahn decomposition of $X_n$, and consider the two
generalized Borel sets

$$X^+ = \bigcup_{n=1}^{\infty} X_n^+, \qquad X^- = \bigcup_{n=1}^{\infty} X_n^-. \tag{18}$$

If $E \subset X^+$ is a Borel set, then

$$E = \bigcup_{n=1}^{\infty} E X_n^+, \qquad \mu(E) = \sum_{n=1}^{\infty} \mu(E X_n^+) \ge 0,$$

while if $E \subset X^-$, then

$$E = \bigcup_{n=1}^{\infty} E X_n^-, \qquad \mu(E) = \sum_{n=1}^{\infty} \mu(E X_n^-) \le 0.$$

Thus $X = X^+ \cup X^-$, with $X^+$ and $X^-$ given by (18), is the required Hahn
decomposition, and the proof is complete.


THEOREM 11. Let $\mathfrak{A}$, $\mu$, $X^+$ and $X^-$ be the same as in
Theorem 10. Then the positive, negative and total variations of $\mu$ are
given by

$$\lambda(E) = \mu(EX^+), \qquad \nu(E) = -\mu(EX^-) \tag{19}$$
and
$$\varphi(E) = \sup \sum_{k=1}^{m} |\mu(A_k)| \tag{20}$$

for every Borel set $E$, where the least upper bound in (20) is taken with
respect to all finite unions $F = A_1 \cup \cdots \cup A_m$ of disjoint sets
$A_k \in \mathfrak{A}$, $A_k \subset E$.


Proof. Recalling the formulas (14), we see that $\lambda(E) = \mu(E)$,
$\nu(E) = 0$ for every Borel set $E \subset X^+$, while $\lambda(E) = 0$,
$\nu(E) = -\mu(E)$ for every Borel set $E \subset X^-$. But any $E \subset X$
can be written in the form

$$E = EX^+ \cup EX^-.$$

Therefore

$$\lambda(E) = \lambda(EX^+) = \mu(EX^+), \qquad \nu(E) = \nu(EX^-) = -\mu(EX^-),$$

which agrees with (19). Moreover, since

$$|\mu(A_k)| = |\lambda(A_k) - \nu(A_k)| \le \lambda(A_k) + \nu(A_k) = \varphi(A_k),$$

we have

$$\sum_{k=1}^{m} |\mu(A_k)| \le \sum_{k=1}^{m} \varphi(A_k) \le \varphi(E),$$

and hence

$$\sup \sum_{k=1}^{m} |\mu(A_k)| \le \varphi(E). \tag{21}$$

On the other hand, as just shown,

$$\varphi(E) = \lambda(E) + \nu(E) = \mu(EX^+) - \mu(EX^-) = |\mu(EX^+)| + |\mu(EX^-)|,$$

and hence obviously

$$\varphi(E) \le \sup \sum_{k=1}^{m} |\mu(A_k)|. \tag{22}$$

Comparing (21) and (22), we obtain (20) as required.
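For a signed measure made of finitely many point masses, the Hahn decomposition and formulas (19) can be verified exhaustively. The following is a minimal sketch, assuming a five-point set with hypothetical integer masses; the names `mass`, `lam` and `nu` are illustrative, and the variations are computed directly from the suprema in (14).

```python
from itertools import combinations

# Hypothetical signed point masses on a finite set X.
mass = {"a": 3, "b": -4, "c": 0, "d": 8, "e": -1}
X = frozenset(mass)

def mu(E):
    """Signed measure of a subset E of X."""
    return sum(mass[x] for x in E)

def subsets(E):
    """All subsets of E, as frozensets."""
    items = sorted(E)
    for r in range(len(items) + 1):
        for A in combinations(items, r):
            yield frozenset(A)

# Hahn decomposition: points of nonnegative mass versus points of negative mass.
X_plus = frozenset(x for x in X if mass[x] >= 0)
X_minus = X - X_plus

def lam(E):
    """Positive variation, computed from the supremum in (14)."""
    return max(mu(A) for A in subsets(E))

def nu(E):
    """Negative variation, computed from the supremum in (14)."""
    return max(-mu(A) for A in subsets(E))

for E in subsets(X):
    assert lam(E) == mu(E & X_plus)      # first formula in (19)
    assert nu(E) == -mu(E & X_minus)     # second formula in (19)
```

In this model the total variation $\lambda(E) + \nu(E)$ equals the sum of $|{\rm mass}|$ over $E$, and the supremum in (20) is attained by splitting $E$ into single points.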




*8.7. The General Continuous Linear Functional on the 
Space C(X) 


We now extend the considerations of Sec. 5.4 to the case where the basic
block $\mathbf{B}$ is replaced by a general compact metric space $X$, i.e., a
space $X$ equipped with a metric $\rho$ such that every infinite subset of $X$
contains a sequence converging to a point in $X$.⁵ Let $\mu$ be a (signed) Borel
measure defined on a $\sigma$-ring $\mathfrak{A}$ of subsets of $X$, containing
$X$ and all its open subsets, and let $f(x)$ be continuous on $X$. Given any
real $c$, the set $\{x: f(x) > c\}$ is $\mu$-measurable (being open), and hence
$f(x)$ is itself $\mu$-measurable. Moreover, $f(x)$ is $\mu$-summable (being
$\mu$-measurable and bounded). Therefore we can form the Lebesgue integral

$$I_\mu f = \int_X f(x)\,\mu(dx)$$

of any function $f(x)$ continuous on $X$. Let $C(X)$ be the normed linear space
of all functions continuous on $X$, equipped with the norm

$$\|f\| = \max_{x \in X} |f(x)|$$

(cf. p. 94). Then the integral $I_\mu f$ defines a continuous linear functional
on $C(X)$, since it satisfies the following two conditions:


1) If $f_1$, $f_2$ are any two functions in $C(X)$ and $\alpha_1$, $\alpha_2$
are any two real numbers, then

$$I_\mu(\alpha_1 f_1 + \alpha_2 f_2) = \alpha_1 I_\mu f_1 + \alpha_2 I_\mu f_2.$$

2) If $f_m \in C(X)$ is a sequence such that $\|f_m\| \to 0$ as $m \to \infty$,
then $I_\mu f_m \to 0$, as follows at once from the estimate

$$|I_\mu f_m| = \Bigl|\int_X f_m(x)\,\mu(dx)\Bigr| \le \|f_m\|\,\varphi(X),$$

where $\varphi$ is the total variation of $\mu$ (see p. 161).
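The estimate in Condition 2 can be spelled out in one line. The derivation below is a sketch which assumes the decomposition $\mu = \lambda - \nu$ into positive and negative variations, the identity $I_\mu f = I_\lambda f - I_\nu f$, and the total variation written as $\varphi = \lambda + \nu$ (the symbol $\varphi$ is a reading of the damaged notation):

```latex
\begin{aligned}
|I_\mu f| &= |I_\lambda f - I_\nu f|
           \le I_\lambda |f| + I_\nu |f| \\
          &\le \|f\|\,\lambda(X) + \|f\|\,\nu(X)
           = \|f\|\,\varphi(X).
\end{aligned}
```

The second inequality uses only $|f(x)| \le \|f\|$ on $X$ and the nonnegativity of $\lambda$ and $\nu$.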

Next we prove the analogue of Theorem 1, p. 95:


THEOREM 12. Given a continuous linear functional $If$ defined on the
space $C(X)$, there exists a Borel measure $\mu$ defined on a $\sigma$-ring
$\mathfrak{A}$ of subsets of $X$, containing $X$ and all its open subsets,
such that

$$If = \int_X f(x)\,\mu(dx). \tag{23}$$


Proof. First suppose $I$ is nonnegative, so that $If \ge 0$ if $f(x) \ge 0$.
Then, choosing $C(X)$ as the space of elementary functions and the
functional $I$ as the elementary integral, we can construct a space $L_I$ of
$I$-summable functions. The only nontrivial part of this assertion, given
the theory of Chap. 2, is to verify that $I$ satisfies Axiom 3, p. 24. But
according to Dini's lemma (p. 54), which generalizes at once to the case
of a compact metric space, $f_m \searrow 0$ implies $f_m \to 0$ uniformly, i.e.,
$\|f_m\| \to 0$, and hence $If_m \to 0$. Now let $\mathfrak{A}$ be the class of
$I$-measurable subsets of $X$. Clearly $\mathfrak{A}$ contains $X$ and all its
open subsets. In fact, if $G \subset X$ is open, we have
$G = \{x: \rho(x) > 0\}$, where $\rho(x)$ is the distance between the point
$x \in X$ and the set $X - G$, a function which is easily seen to be
continuous. The measure

$$\mu(E) = I\chi_E(x), \qquad E \in \mathfrak{A}$$

is a Borel measure on $\mathfrak{A}$ (in fact, a Lebesgue measure, since
$\mathfrak{A}$ is actually a $\sigma_\mu$-ring). Moreover $\mu$ satisfies the
relation (23), as required.

If the functional $I$ takes values of either sign, then, according to
Riesz's representation theorem (p. 44), we can represent $I$ in the form

$$I = J - N,$$

where the linear functionals $J$ and $N$ are nonnegative and continuous
in the sense that $h_m \searrow 0$ implies $Jh_m \to 0$, $Nh_m \to 0$. This time
let $\mathfrak{A}$ be the class of $K$-measurable subsets of $X$, where
$K = J + N$, and define the nonnegative measures

$$\lambda(E) = J\chi_E(x), \qquad \nu(E) = N\chi_E(x), \qquad E \in \mathfrak{A}$$

and the signed measure

$$\mu(E) = \lambda(E) - \nu(E).$$

Then (23) continues to hold with this choice of $\mu$, and the proof is
complete.


⁵ Here convergence of a sequence $x_m$ to a limit $x_0$ means convergence of the
"distances" $\rho(x_m, x_0)$ to zero.


*8.8. The Lebesgue-Stieltjes Integral on an 
Infinite-Dimensional Space 


In Chap. 5 we defined the Lebesgue-Stieltjes integral on a closed basic
block

$$X = \{x \in R_n: a_1 \le x_1 \le b_1, \ldots, a_n \le x_n \le b_n\}$$

(denoted there by $\mathbf{B}$) in Euclidean $n$-space $R_n$. The set $X$ can be
regarded as the set of all real functions $x_k = x(k)$ which are defined at $n$
points $k = 1, \ldots, n$ and for each $k$ take values in the interval
$[a_k, b_k]$. If, for convenience, we make a preliminary change of variables
transforming each interval $[a_k, b_k]$ into the unit interval $[0, 1]$, the
set $X$ has the particularly simple form

$$X = \{x \in R_n: 0 \le x_1 \le 1, \ldots, 0 \le x_n \le 1\}.$$




Suppose we now replace the finite set $\{1, \ldots, n\}$ of values of the index
$k$ by an arbitrary infinite set $T$ of values of the index $t$. Correspondingly,
we regard $X$ as the set of all real functions $x(t)$ which are defined on $T$
and for each $t \in T$ take values in the interval $[0, 1]$:

$$X = \{x(t): t \in T,\ 0 \le x(t) \le 1\}.$$

Such a set $X$ will be called the "infinite-dimensional cube," the
"$T$-dimensional cube," the "Cartesian product of $T$ closed intervals" or
simply the "$T$-cube." The aim of this section is to show how the Daniell
scheme can be used to construct a theory of integration on $X$.


8.8.1. Cylinder sets, blocks and quasi-volumes. Extensions and projections.
In the case of an $n$-dimensional cube, the concepts of passage to the limit
and continuity of functions can be defined in terms of the distance between
points. In the case of a $T$-dimensional cube, it is in general no longer
possible to introduce a "natural distance." However, we can still introduce a
"natural topology," i.e., given any $\varepsilon > 0$, any integer $n > 0$ and
any $n$ points $t_1, \ldots, t_n$, a set of the form

$$U = U_{x_0} = \{x(t) \in X: |x(t_k) - x_0(t_k)| < \varepsilon,\ k = 1, \ldots, n\}$$

is called a "neighborhood" of the "point" $x_0(t) \in X$. The set $X$, equipped
with this topology, is a topological space, in fact a compact Hausdorff space.
It is clear that $X$ is a Hausdorff space, since it is easily verified that


a) Every point $x \in X$ has at least one neighborhood $U_x$;


b) If $U_x$ is a neighborhood of $x$ and $x'$ is a point in $U_x$, then there is
a neighborhood $U_{x'}$ of $x'$ contained in $U_x$;


c) If $U_x$ and $U_x'$ are two neighborhoods of the same point $x \in X$, then
there is a neighborhood $U_x''$ of $x$ contained in the intersection $U_x U_x'$;


d) Given any two distinct points $x, x' \in X$, there are neighborhoods $U_x$
and $U_{x'}$ which are disjoint.


The compactness of $X$ follows from Tychonoff's theorem, which asserts that
a Cartesian product of compact spaces is compact in the product topology.⁶

A set $E \subset X$ is said to be a cylinder set if we can find an integer
$n > 0$ and points $t_1, \ldots, t_n \in T$ such that

$$E = \{x(t) \in X: (x(t_1), \ldots, x(t_n)) \in E^n\}, \tag{24}$$

where the set $E^n$, called the base of the cylinder set $E$, is a subset of the
$n$-dimensional cube⁷

$$X^n = \{\xi: 0 \le \xi_1 \le 1, \ldots, 0 \le \xi_n \le 1\}.$$


⁶ For the proof, see e.g., N. Dunford and J. T. Schwartz, Linear Operators, Part I:
General Theory, Interscience Publishers, Inc., New York (1958), p. 32.

⁷ To avoid confusion with functions $x(t) \in X$, we use a different letter
$\xi = (\xi_1, \ldots, \xi_n)$ for the variable point in $X^n$.




In particular, the complement of the cylinder set (24) relative to the whole
space $X$ is itself a cylinder set:

$$X - E = \{x(t) \in X: (x(t_1), \ldots, x(t_n)) \in X^n - E^n\}.$$

Cylinder sets of the special form

$$B = \{x(t) \in X: \alpha_1 < x(t_1) \le \beta_1, \ldots, \alpha_n < x(t_n) \le \beta_n\} \tag{25}$$

[where the inequality $\alpha_k < x(t_k)$ is replaced by the inequality
$0 \le x(t_k)$ if $\alpha_k = 0$] are called blocks. Since every block is
specified by only a finite number of conditions, there is an integer $N > 0$
such that $N$ conditions are enough to specify every block in a given finite
family of blocks. It follows that the family of all blocks $B \subset X$ is a
semiring, and, by the same token, that the family of all cylinder sets
$E \subset X$ is a ring.

By a quasi-volume defined on $X$ we mean an additive real set function
$\omega(B)$ defined on every block $B \subset X$. The quasi-volumes considered
here will be assumed to have two further properties:


1) Nonnegativity and boundedness. For every $B \subset X$,
$$\omega(B) \ge 0,$$
and moreover $\omega(X) < \infty$.


2) Continuity on the empty set in every $n$-dimensional cube (cf. p. 100).
If $B_1 \supset B_2 \supset \cdots$ is a sequence of blocks defined by the same
fixed set of coordinates $t_1, \ldots, t_n \in T$, with an empty intersection
$$\bigcap_{m=1}^{\infty} B_m = \varnothing,$$
then
$$\lim_{m \to \infty} \omega(B_m) = 0.$$

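It should be emphasized that $\omega(B)$ depends only on the block $B$ itself,
and not on the particular form (25) in which $B$ is written. Thus the
quasi-volume $\omega$ must satisfy the following "consistency condition":

$$\omega\{x: \alpha_1 < x(t_1) \le \beta_1, \ldots, \alpha_n < x(t_n) \le \beta_n\}$$
$$= \omega\{x: \alpha_1 < x(t_1) \le \beta_1, \ldots, \alpha_n < x(t_n) \le \beta_n,\ 0 \le x(t_{n+1}) \le 1\}. \tag{26}$$

Given a quasi-volume $\omega$ defined on all blocks (25), where $n$ and
$t_1, \ldots, t_n$ are arbitrary, we can use the formula

$$\omega^{t_1, \ldots, t_n}\{\xi \in X^n: \alpha_1 < \xi_1 \le \beta_1, \ldots, \alpha_n < \xi_n \le \beta_n\}$$
$$= \omega\{x \in X: \alpha_1 < x(t_1) \le \beta_1, \ldots, \alpha_n < x(t_n) \le \beta_n\} \tag{27}$$

to define a quasi-volume $\omega^{t_1, \ldots, t_n}$ on every $n$-dimensional
cube $X^n$. In particular,

$$\omega^{t_1, \ldots, t_n}(X^n) = \omega(X), \tag{28}$$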




and the consistency condition now takes the form

$$\omega^{t_1, \ldots, t_n}\{\xi \in X^n: \alpha_1 < \xi_1 \le \beta_1, \ldots, \alpha_n < \xi_n \le \beta_n\}$$
$$= \omega^{t_1, \ldots, t_n, t_{n+1}}\{\xi \in X^{n+1}: \alpha_1 < \xi_1 \le \beta_1, \ldots, \alpha_n < \xi_n \le \beta_n,\ 0 \le \xi_{n+1} \le 1\}.$$

It follows from Condition 2 that every quasi-volume $\omega^{t_1, \ldots, t_n}$
is continuous in the sense of Sec. 5.6.
Conversely, let

$$\omega^{t_1, \ldots, t_n} \qquad (n, t_1, \ldots, t_n \text{ arbitrary})$$

be a family of continuous nonnegative quasi-volumes, which are defined on
every $n$-dimensional cube $X^n$ and satisfy the consistency condition (26).
Then we can define a quasi-volume $\omega$ on the set $X$ by the simple
expedient of reading (27) from right to left.
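The consistency condition is easy to see in a concrete model. The sketch below assumes, purely for illustration, that $\omega$ is the product of interval lengths; a block is stored as a dictionary mapping an index $t$ to a pair `(alpha, beta)` standing for the condition $\alpha < x(t) \le \beta$. The function name `omega` and the sample indices are hypothetical.

```python
from math import prod

def omega(block):
    """Quasi-volume of a block given as {t: (alpha, beta)}.

    Each entry stands for the condition alpha < x(t) <= beta.
    Assumption: omega is the product of interval lengths (illustrative choice).
    """
    for alpha, beta in block.values():
        if not (0 <= alpha <= beta <= 1):
            raise ValueError("need 0 <= alpha <= beta <= 1 for every condition")
    return prod(beta - alpha for alpha, beta in block.values())

# A block specified by two coordinates; the indices t may be arbitrary labels.
B = {"t1": (0.25, 0.75), 3.7: (0.0, 0.5)}

# The same block written with a redundant condition 0 <= x(t_extra) <= 1, as in (26).
B_redundant = {**B, "t_extra": (0.0, 1.0)}

# Additivity: split B along the coordinate t1 into two disjoint blocks.
B_left = {**B, "t1": (0.25, 0.5)}
B_right = {**B, "t1": (0.5, 0.75)}

print(omega(B), omega(B_redundant), omega(B_left) + omega(B_right))
```

With no conditions at all the block is the whole cube, and `omega({})` returns 1, which plays the role of $\omega(X)$ in (28).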

A key concept in the present theory is that of a function $f(x)$, $x \in X$
which effectively depends on only a finite number of "coordinates"
$x(t_1), \ldots, x(t_n)$, i.e., whose value remains the same if any of the
coordinates $x(t)$, $t \ne t_1, \ldots, t_n$ is changed. With every such
function $f(x)$ we can associate a function

$$f(\xi) = f(\xi_1, \ldots, \xi_n) = f[x(t_1), \ldots, x(t_n)] \tag{29}$$

defined on the $n$-dimensional cube $X^n$. This correspondence between $f(x)$
and $f(\xi)$ is one-to-one, in the sense that given any function $f(\xi)$
defined on $X^n$ and any $n$ points $t_1, \ldots, t_n$ ($n$ arbitrary), we can
define a function $f(x)$ depending only on the coordinates
$x(t_1), \ldots, x(t_n)$, by merely reading (29) from right to left. The
function $f(x)$ will be called the extension of $f(\xi)$ from $X^n$ onto $X$,
while the function $f(\xi)$ will be called the projection of $f(x)$ from $X$
onto $X^n$.


LEMMA 1. If $f(\xi)$ is continuous on $X^n$, then $f(x)$ is continuous on $X$,
and conversely.


Proof. If $f(\xi)$ is continuous at $\xi^0 = (\xi_1^0, \ldots, \xi_n^0) \in X^n$,
then, given any $\varepsilon > 0$, there is a $\delta$ such that
$$|\xi_k - \xi_k^0| < \delta \qquad (k = 1, \ldots, n)$$
implies
$$|f(\xi) - f(\xi^0)| = |f(x) - f(x_0)| < \varepsilon,$$
where $x_0$ is any point in $X$ such that $x_0(t_k) = \xi_k^0$
($k = 1, \ldots, n$). In other words, $|f(x) - f(x_0)| < \varepsilon$ provided
that $x$ belongs to the neighborhood
$$\{x(t) \in X: |x(t_k) - x_0(t_k)| < \delta,\ k = 1, \ldots, n\}$$
(recall the "natural topology" of p. 168). The converse follows by
detailed reversal of steps. The fact that
$$X = \{x(t) \in X: x(t_1) = \xi_1, \ldots, x(t_n) = \xi_n,\ \xi \in X^n\},$$
$$X^n = \{\xi \in X^n: \xi_1 = x(t_1), \ldots, \xi_n = x(t_n),\ x(t) \in X\}$$
is used to go from point continuity to continuity on the sets $X$ and $X^n$.
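The passage between a function on $X^n$ and its extension to $X$ amounts to composing with the evaluation map $x \mapsto (x(t_1), \ldots, x(t_n))$. A minimal sketch follows, assuming that a point of $X$ is modelled as a Python callable $t \mapsto x(t)$ on the hypothetical index set $T = [0, 4]$; the helper name `extend` is invented for this illustration.

```python
def extend(f_xi, ts):
    """Extension of f(xi_1, ..., xi_n) from X^n onto X, i.e. (29) read right to left.

    A point x of X is modelled as a callable t -> x(t) with values in [0, 1].
    """
    def f_x(x):
        # evaluate x at the chosen coordinates and pass the values to f_xi
        return f_xi(*(x(t) for t in ts))
    return f_x

f_xi = lambda a, b: a * a + 3 * b            # a function on X^2
f_x = extend(f_xi, ts=(0.5, 2.0))            # depends only on x(0.5) and x(2.0)

x = lambda t: t / 4                           # one point of X (index set T = [0, 4])
y = lambda t: t / 4 if t in (0.5, 2.0) else 1.0   # agrees with x only at 0.5 and 2.0

print(f_x(x), f_x(y))
```

Both calls return the same value, since `x` and `y` differ only in coordinates that `f_x` never reads.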




8.8.2. Construction of the space $L_\omega(X)$. Kolmogorov's theorem. We are
now ready to use the Daniell scheme to construct a theory of Lebesgue-Stieltjes
integration on $X$, with respect to a given quasi-volume $\omega$. As our
space of elementary functions we take the set $H$ of all continuous functions
on $X$ which depend only on a finite number of coordinates. This choice of $H$
clearly satisfies Axioms a and b, p. 23. In fact, Axiom b is obvious, while
to verify Axiom a, we need only note that if $f(x)$ and $g(x)$ are two functions
in $H$, the first depending on coordinates $x(t_1), \ldots, x(t_n)$ and the
second on coordinates $x(t_1'), \ldots, x(t_r')$, then the linear combination
$\alpha f(x) + \beta g(x)$, where $\alpha$ and $\beta$ are arbitrary real
numbers, also depends on only finitely many coordinates, more exactly on the
coordinates $x(t_1''), \ldots, x(t_s'')$ where⁸

$$\{t_1'', \ldots, t_s''\} = \{t_1, \ldots, t_n\} \cup \{t_1', \ldots, t_r'\}.$$


As in Sec. 5.1, a natural choice of the elementary integral on $H$ is the
finite-dimensional Riemann-Stieltjes integral⁹

$$I_\omega f = \int_{X^n} f(\xi_1, \ldots, \xi_n)\,\omega^{t_1, \ldots, t_n}(d\xi), \tag{30}$$

which can also be written in the form

$$I_\omega f = \int_{X^{n+1}} f(\xi_1, \ldots, \xi_n)\,\omega^{t_1, \ldots, t_n, t_{n+1}}(d\xi\,d\xi_{n+1}), \tag{31}$$

because of the consistency condition (26). However, before we can invoke
the general Daniell scheme of Chap. 2, we must first prove


THEOREM 13. The integral (30) has all the properties of an elementary
integral.


Proof. Axiom 2, p. 24 is obvious, i.e., if $f(x) \ge 0$, then $I_\omega f \ge 0$,
but it takes a little more thought to verify Axioms 1 and 3. Let $f(x)$,
$g(x) \in H$ be the same as in the verification of Axiom a above, and let
$\alpha$, $\beta$ be arbitrary real numbers. Then, because of the obvious
generalization of (31),¹⁰

$$I_\omega(\alpha f + \beta g) = \int_{X^s} [\alpha f(\xi'') + \beta g(\xi'')]\,\omega^{t_1'', \ldots, t_s''}(d\xi'')$$
$$= \alpha \int_{X^s} f(\xi'')\,\omega^{t_1'', \ldots, t_s''}(d\xi'') + \beta \int_{X^s} g(\xi'')\,\omega^{t_1'', \ldots, t_s''}(d\xi'')$$
$$= \alpha \int_{X^n} f(\xi)\,\omega^{t_1, \ldots, t_n}(d\xi) + \beta \int_{X^r} g(\xi')\,\omega^{t_1', \ldots, t_r'}(d\xi')$$
$$= \alpha I_\omega f + \beta I_\omega g,$$

which proves Axiom 1. As for Axiom 3, we start from the estimate

$$|I_\omega f_m| \le \omega^{t_1, \ldots, t_n}(X^n) \max_{\xi \in X^n} |f_m(\xi)| = \omega(X) \max_{x \in X} |f_m(x)|$$

implied by (30) and the corresponding estimate for finite-dimensional
Riemann-Stieltjes integrals [formula (12), p. 68]. But according to
Dini's lemma (p. 54), which generalizes at once to the case of a compact
topological space, $f_m \searrow 0$ implies $f_m \to 0$ uniformly, i.e.,

$$\max_{x \in X} |f_m(x)| \to 0,$$

and hence $I_\omega f_m \to 0$, as required.


⁸ Note that in general $s \ne n + r$, since the sets $\{t_1, \ldots, t_n\}$ and
$\{t_1', \ldots, t_r'\}$ may intersect.
⁹ The existence of (30) follows from Lemma 1 and Theorem 1, p. 66.
¹⁰ Here, of course, $\xi = (\xi_1, \ldots, \xi_n)$, $\xi' = (\xi_1', \ldots, \xi_r')$
and $\xi'' = (\xi_1'', \ldots, \xi_s'')$.
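The verification of Axiom 1 rests on the fact that an integral over extra, redundant coordinates changes nothing. A numerical sketch, assuming for illustration that every $\omega^{t_1, \ldots, t_n}$ is the product of lengths and approximating the integrals by midpoint Riemann sums; the functions `f`, `g` and the helper `riemann` are hypothetical choices, not part of the text.

```python
from itertools import product

def midpoints(m):
    """Midpoints of m equal subintervals of [0, 1]."""
    return [(k + 0.5) / m for k in range(m)]

def riemann(f, n, m=20):
    """Midpoint Riemann sum of f over the cube X^n for the product of lengths."""
    pts = midpoints(m)
    return sum(f(*xi) for xi in product(pts, repeat=n)) / m**n

# f depends on (t1, t2) and g on (t2, t3); the union {t1, t2, t3} has s = 3,
# which differs from n + r = 4 because the two coordinate sets intersect.
f = lambda a, b: a * b
g = lambda b, c: b + c * c
alpha, beta = 2.0, -3.0

I_f = riemann(f, 2)        # integral of f over its own coordinates
I_g = riemann(g, 2)        # integral of g over its own coordinates
I_comb = riemann(lambda a, b, c: alpha * f(a, b) + beta * g(b, c), 3)

print(I_comb, alpha * I_f + beta * I_g)
```

The two printed numbers agree up to rounding, mirroring the chain of equalities in the proof: summing over the unused coordinate only averages a constant.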


Thus all the prerequisites for constructing a theory of the integral, based
on continuous functions depending only on finitely many coordinates
$x(t_1), \ldots, x(t_n)$ as elementary functions, with the integral (30) as
elementary integral, are satisfied. This leads to a space $L_\omega(X)$ of
$\omega$-summable functions on $X$, and a corresponding $\sigma_\omega$-ring of
$\omega$-summable sets. In fact, we have arrived at the following celebrated
result, proved by Kolmogorov in 1933:


THEOREM 14 (Kolmogorov's theorem). Let $X$ be the infinite-dimensional
cube, and let $\omega$ be a quasi-volume defined on all blocks $B \subset X$,
such that every quasi-volume $\omega^{t_1, \ldots, t_n}$ is continuous on the
corresponding finite-dimensional cube $X^n$. Then $\omega$ can be extended to a
Lebesgue measure on a $\sigma_\omega$-ring of subsets of $X$.


Proof. We need only show that the measure $\omega(E) = I_\omega\chi_E$ defined
on the $\sigma_\omega$-ring of $\omega$-measurable subsets of $X$ reduces to the
original quasi-volume $\omega$, i.e., that

$$\omega(B) = I_\omega\chi_B$$

for every block $B \subset X$. Suppose $B$ is specified by $n$ conditions
involving the coordinates $x(t_1), \ldots, x(t_n)$, and let
$I_{\omega^{t_1, \ldots, t_n}}$ be the Lebesgue-Stieltjes integral for the space
of $\omega^{t_1, \ldots, t_n}$-summable functions, constructed via the Daniell
scheme, starting from the space of continuous functions on $X^n$ equipped with
the Riemann-Stieltjes integral with respect to $\omega^{t_1, \ldots, t_n}$.
Moreover, let $f_m(\xi)$ be a sequence of continuous functions on $X^n$
converging everywhere to the characteristic function $\chi_{B^n}(\xi)$ of the
base $B^n$ of the block $B$. Then, invoking the continuity of
$\omega^{t_1, \ldots, t_n}$ on $X^n$ (cf. Sec. 5.6), we have

$$\omega(B) = \omega^{t_1, \ldots, t_n}(B^n) = I_{\omega^{t_1, \ldots, t_n}}\chi_{B^n}(\xi) = \lim_{m \to \infty} I_{\omega^{t_1, \ldots, t_n}} f_m(\xi)$$
$$= \lim_{m \to \infty} I_\omega f_m(x) = I_\omega\chi_B(x),$$

and the proof is complete.




Remark. Theorem 14 has an obvious generalization to the case of a
signed quasi-volume $\omega$, provided that the total variations of the
$\omega^{t_1, \ldots, t_n}$ are all bounded by a fixed constant, i.e., that

$$V_{\omega^{t_1, \ldots, t_n}}(X^n) \le M$$

for arbitrary $n, t_1, \ldots, t_n$.


THEOREM 15. The integral of a function $f(x) \in L_\omega(X)$ can be expressed
as the limit of a sequence of finite-dimensional integrals, i.e.,

$$\int_X f(x)\,\omega(dx) = \lim_{m \to \infty} \int_{X^{n_m}} f_m(\xi_1, \ldots, \xi_{n_m})\,\omega^{t_1, \ldots, t_{n_m}}(d\xi). \tag{32}$$


Proof. The set of elementary functions $H$, i.e., the set of all continuous
functions on $X$ depending on only finitely many coordinates,
is dense in $L_\omega(X)$ [see p. 40], thereby implying (32).
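A simple instance of (32) can be computed exactly. The sketch assumes, for illustration only, that $\omega$ is the product of lengths and considers $f(x) = \sum_{k \ge 1} 2^{-k} x(t_k)$ for a fixed sequence $t_1, t_2, \ldots$ in $T$; the partial sums $f_m$ depend on finitely many coordinates and converge uniformly to $f$. The function name `integral_partial_sum` is hypothetical.

```python
from fractions import Fraction

def integral_partial_sum(m):
    """Integral over X^m of f_m(xi) = sum_{k=1}^m 2^{-k} xi_k for the product of lengths.

    Each coordinate xi_k integrates to 1/2 over [0, 1], so linearity gives
    the exact value as a finite sum.
    """
    return sum(Fraction(1, 2**k) * Fraction(1, 2) for k in range(1, m + 1))

for m in (1, 2, 5, 10, 20):
    value = integral_partial_sum(m)
    # the gap to the limiting value 1/2 shrinks like 2^{-(m+1)}
    print(m, value, Fraction(1, 2) - value)
```

The finite-dimensional integrals equal $(1 - 2^{-m})/2$ and increase to $1/2$, which is the value of the infinite-dimensional integral of $f$ in this model.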


COROLLARY. The set of polynomials with (all possible) arguments
$t_1, \ldots, t_n$ is dense in $L_\omega(X)$.


Proof. Any function in $H$ can be uniformly approximated by
polynomials (Weierstrass' theorem). But $H$ is dense in $L_\omega(X)$, as just
noted.


8.8.3. Structure of $\omega$-measurable sets and functions. As shown on p. 144,
the semiring of all blocks $B^n \subset X^n$ is completely sufficient. This
result is easily generalized to the infinite-dimensional case:


THEOREM 16. The semiring $\mathfrak{A}$ of all blocks $B \subset X$ is
completely sufficient.


Proof. Given any elementary function $f(x)$, let $f(\xi)$ be its projection
from $X$ onto $X^n$. Then there is a sequence of step functions

$$h_m(\xi) = \sum_{k=1}^{m} \alpha_{km}\,\chi_{B_{km}^n}(\xi)$$

converging to $f(\xi)$, where the $B_{km}^n$ are blocks in $X^n$. Therefore the
sequence

$$h_m(x) = \sum_{k=1}^{m} \alpha_{km}\,\chi_{B_{km}}(x), \qquad B_{km} \in \mathfrak{A},$$

where $B_{km}$ is the block with base $B_{km}^n$, converges to $f(x)$. This
shows that $\mathfrak{A}$ is sufficient (cf. Sec. 7.3).

Next let $Z \subset X$ be a set of $\omega$-measure zero. Then, given any
$\varepsilon > 0$, there exists a nondecreasing sequence of nonnegative
elementary functions $f_m(x)$ such that $I_\omega f_m < \varepsilon$ and
$\sup f_m(x) \ge 1$ on $Z$. Consider the sets

$$E_m = \{x: f_m(x) > \tfrac{1}{2}\} \qquad (m = 1, 2, \ldots).$$




Clearly $E_1 \subset E_2 \subset \cdots$, and every $E_m$ is a cylinder set. The
base $E_m^n$ of $E_m$ is open in $X^n$, being the preimage of an open set under
a continuous mapping. Therefore $E_m^n$ can be represented as a countable union
of blocks in $X^n$, i.e.,

$$E_m^n = \bigcup_{k=1}^{\infty} B_{km}^n$$

(cf. p. 145). By the same token, $E_m$ is open in $X$ and can be represented
as a countable union of blocks in $X$. In fact,

$$E_m = \bigcup_{k=1}^{\infty} B_{km},$$

where $B_{km}$ is the block with base $B_{km}^n$. But obviously

$$Z \subset \bigcup_{m=1}^{\infty} E_m = \bigcup_{m=1}^{\infty} \bigcup_{k=1}^{\infty} B_{km},$$

and moreover, according to the corollary on p. 118,

$$\omega\Bigl(\bigcup_{m=1}^{\infty} \bigcup_{k=1}^{\infty} B_{km}\Bigr) = \lim_{m \to \infty} \omega\Bigl(\bigcup_{k=1}^{\infty} B_{km}\Bigr) = \lim_{m \to \infty} \omega(E_m) \le 2\varepsilon,$$

since $I_\omega f_m < \varepsilon$ implies $\omega(E_m) < 2\varepsilon$.
Therefore $Z$ is covered by countably many blocks $B_{km}$, whose union has
arbitrarily small $\omega$-measure. In other words, $\mathfrak{A}$ is completely
sufficient (cf. Sec. 7.4), and the theorem is proved.
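The step from $I_\omega f_m < \varepsilon$ to $\omega(E_m) < 2\varepsilon$ is a Chebyshev-type estimate. A short derivation, assuming the sets are $E_m = \{x: f_m(x) > \tfrac12\}$ (the threshold $\tfrac12$ is inferred from the factor 2 in the bound) and that $f_m \ge 0$:

```latex
\begin{aligned}
\chi_{E_m}(x) &\le 2 f_m(x) \quad \text{for all } x \in X
   \qquad (\text{since } f_m > \tfrac12 \text{ on } E_m,\ f_m \ge 0 \text{ elsewhere}), \\
\omega(E_m) &= I_\omega \chi_{E_m} \le 2\, I_\omega f_m < 2\varepsilon .
\end{aligned}
```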


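LEMMA 2. Given any block
$$B = \{x(t) \in X: \alpha_1 < x(t_1) \le \beta_1, \ldots, \alpha_n < x(t_n) \le \beta_n\}$$
and any $\varepsilon > 0$, there is a block $B'$ such that every point of $B$ is
an interior point of $B'$ and
$$\omega(B') < \omega(B) + \varepsilon. \tag{33}$$


Proof. If $B'$ is of the form

$$B' = \{x(t) \in X: \alpha_1 < x(t_1) \le \beta_1', \ldots, \alpha_n < x(t_n) \le \beta_n'\},$$

where $\beta_k' = \beta_k$ if $\beta_k = 1$ and $\beta_k' > \beta_k$ if
$\beta_k < 1$ ($k = 1, \ldots, n$), then every point of $B$ is an interior point
of $B'$. [Note that the intersection of $B'$ with any sheet $x(t_k) = 0$ or
$x(t_k) = 1$ consists entirely of interior points of $B'$ (relative to the
cube $X$).] Moreover,

$$B \subset B' \subset B \cup \{x(t) \in X: \beta_1 < x(t_1) \le \beta_1'\} \cup \cdots \cup \{x(t) \in X: \beta_n < x(t_n) \le \beta_n'\},$$

and hence

$$\omega(B) \le \omega(B') \le \omega(B) + \omega\{x(t) \in X: \beta_1 < x(t_1) \le \beta_1'\} + \cdots + \omega\{x(t) \in X: \beta_n < x(t_n) \le \beta_n'\}. \tag{34}$$

According to Condition 2, p. 169,

$$\lim_{\beta_k' \to \beta_k} \omega\{x(t) \in X: \beta_k < x(t_k) \le \beta_k'\} = 0 \qquad (k = 1, \ldots, n),$$

i.e., given any $\varepsilon > 0$,

$$\omega\{x(t) \in X: \beta_k < x(t_k) \le \beta_k'\} < \frac{\varepsilon}{n} \qquad (k = 1, \ldots, n),$$

provided that $\beta_k'$ is sufficiently close to $\beta_k$. But with such a
choice of the $\beta_k'$, (34) implies (33), as required.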


Our next result concerns the relation between $\omega$-measurability and
$\omega^{t_1, \ldots, t_n}$-measurability:


THEOREM 17. The cylinder set $E \subset X$ is $\omega$-measurable¹¹ if and only
if its base $E^n \subset X^n$ is $\omega^{t_1, \ldots, t_n}$-measurable, and then
$$\omega^{t_1, \ldots, t_n}(E^n) = \omega(E).$$

Proof. First we note that if $Z^n \subset X^n$ has
$\omega^{t_1, \ldots, t_n}$-measure zero, then the cylinder set $Z \subset X$
with base $Z^n$ has $\omega$-measure zero. In fact, given any $\varepsilon > 0$,
there exists a nondecreasing sequence of nonnegative functions $f_m(\xi)$
continuous on $X^n$ such that

$$I_{\omega^{t_1, \ldots, t_n}} f_m(\xi) < \varepsilon, \qquad \sup_m f_m(\xi) \ge 1 \ \text{ on } Z^n.$$

Let $f_m(x)$ be the extension of $f_m(\xi)$ from $X^n$ onto $X$. Then $f_m(x)$
is a nondecreasing sequence of nonnegative elementary functions such that

$$I_\omega f_m(x) < \varepsilon, \qquad \sup_m f_m(x) \ge 1 \ \text{ on } Z,$$

and hence $Z$ has $\omega$-measure zero, as asserted.

Next let $E^n \subset X^n$ be any $\omega^{t_1, \ldots, t_n}$-measurable set.
Then there exists a sequence of functions $f_m(\xi)$ continuous on $X^n$
converging on a set of full $\omega^{t_1, \ldots, t_n}$-measure to the
characteristic function $\chi_{E^n}(\xi)$. Let $f_m(x)$ be the extension of
$f_m(\xi)$ from $X^n$ onto $X$. Then $f_m(x)$ is a sequence of elementary
functions converging on a set of full $\omega$-measure (as just shown) to the
characteristic function $\chi_E(x)$, and hence $E$ is $\omega$-measurable.
Moreover, the functions $f_m(\xi)$ and $f_m(x)$ can be regarded as bounded,
and hence, by Lebesgue's theorem,

$$\omega^{t_1, \ldots, t_n}(E^n) = \lim_{m \to \infty} I_{\omega^{t_1, \ldots, t_n}} f_m(\xi) = \lim_{m \to \infty} I_\omega f_m(x) = \omega(E),$$

and half of the theorem is proved.


¹¹ And hence $\omega$-summable, since $\omega(X) < \infty$.




The converse is more difficult. Suppose the (arbitrary) cylinder set
$E \subset X$ with base $E^n \subset X^n$ is $\omega$-measurable. To show that
$E^n$ is $\omega^{t_1, \ldots, t_n}$-measurable, it will be enough to prove that

$$(\omega^{t_1, \ldots, t_n})^*(E^n) \le \omega(E), \tag{35}$$

where $(\omega^{t_1, \ldots, t_n})^*(E^n)$ denotes the outer measure of $E^n$
(see Sec. 7.5). In fact, if (35) holds, then

$$(\omega^{t_1, \ldots, t_n})^*(X^n - E^n) \le \omega(X - E),$$

since $X - E$ is a cylinder set with base $X^n - E^n$, and hence

$$(\omega^{t_1, \ldots, t_n})^*(E^n) + (\omega^{t_1, \ldots, t_n})^*(X^n - E^n) \le \omega(E) + \omega(X - E) = \omega(X) = \omega^{t_1, \ldots, t_n}(X^n)$$

[cf. (28)], which, according to Theorem 6, p. 142, implies that $E^n$ is
$\omega^{t_1, \ldots, t_n}$-measurable.¹²

¹² Equation (16), p. 142 can also be written in the form
$$\mu^*(E) + \mu^*(CE) \le \mu(X),$$
since the strict inequality $<$ is in any event impossible.

Thus we now concentrate our attention on proving the inequality
(35). Given any $\varepsilon > 0$, Theorem 16 guarantees the existence of a
countable set of blocks $B_m$ such that

$$E \subset \bigcup_{m=1}^{\infty} B_m, \qquad \sum_{m=1}^{\infty} \omega(B_m) < \omega(E) + \varepsilon.$$

In general, there are points of $E$ which are not interior points of
$B_1 \cup B_2 \cup \cdots$, a fact which would prevent subsequent use of the
finite subcovering lemma (p. 13). To avoid this difficulty, we use Lemma 2
to replace every block $B_m$ by a block $B_m'$ such that every point of $B_m$
is an interior point of $B_m'$ and

$$\sum_{m=1}^{\infty} \omega(B_m') < \omega(E) + 2\varepsilon.$$

To keep the notation simple, we imagine that this replacement has already
been made, and henceforth omit primes. Every block $B_m$ is specified
by a finite number of conditions involving the coordinates $x(t_k)$. These
conditions can always be regarded as including conditions on the coordinates
$x(t_1), \ldots, x(t_n)$ defining the cube $X^n$, if we allow redundant
inequalities of the form $0 \le x(t_k) \le 1$, $k = 1, \ldots, n$. Thus every
block

$$B_m = \{x(t) \in X: \alpha_1 < x(t_1) \le \beta_1, \ldots, \alpha_n < x(t_n) \le \beta_n, \ldots\}$$

has a corresponding "projection"

$$B_m^n = \{\xi \in X^n: \alpha_1 < \xi_1 \le \beta_1, \ldots, \alpha_n < \xi_n \le \beta_n\}$$

onto the cube $X^n$.

Now let $\xi = (\xi_1, \ldots, \xi_n)$ be a fixed point in $E^n$, and consider
the set $Q(\xi)$ of all $x(t) \in X$ such that

$$x(t_k) = \xi_k \qquad (k = 1, \ldots, n).$$

Since $Q(\xi)$ is a compact subset of $X$, it has the finite subcovering
property. Therefore from the blocks $B_1, B_2, \ldots$ covering $Q(\xi)$ we can
select a finite set $B_1, \ldots, B_r$ covering $Q(\xi)$, with projections
$B_1^n, \ldots, B_r^n$ onto $X^n$. Clearly, $\xi$ is an interior point of the
block

$$B^n = \bigcap_{k=1}^{r} B_k^n,$$

and also of a subblock $\tilde{B}^n \subset B^n$ whose boundary sheets all have
rational coordinates. Let $\tilde{B}_1, \ldots, \tilde{B}_r$ be the subblocks of
$B_1, \ldots, B_r$ whose projections onto $X^n$ coincide with $\tilde{B}^n$, but
whose coordinates other than $x(t_1), \ldots, x(t_n)$ are the same as those of
$B_1, \ldots, B_r$. Then the union

$$\bigcup_{k=1}^{r} \tilde{B}_k$$

is a cylinder set (in fact, a block) with base $\tilde{B}^n$. Carrying out the
same procedure for every point $\xi \in E^n$, we succeed in covering the
cylinder set $E$ by a cylinder set $G$, which is a union of blocks with
"rational projections" onto $X^n$. Therefore $G$ is a countable union of
blocks, and hence an $\omega$-measurable (Borel) set in $X$. Moreover

$$\omega(G) \le \sum_{m=1}^{\infty} \omega(B_m) < \omega(E) + 2\varepsilon,$$

since every point of $G$ belongs to one of the original blocks $B_m$. Let $G^n$
be the (Borel) base of $G$. By the first half of the theorem,
$$\omega^{t_1, \ldots, t_n}(G^n) = \omega(G).$$
But then
$$(\omega^{t_1, \ldots, t_n})^*(E^n) \le \omega^{t_1, \ldots, t_n}(G^n) = \omega(G) < \omega(E) + 2\varepsilon,$$
by the definition of outer measure ($E^n \subset G^n$). Taking the limit as
$\varepsilon \to 0$, we obtain (35), and the theorem is proved.


COROLLARY 1. A function $f(x)$, $x \in X$ depending on only a finite
number of coordinates $x(t_1), \ldots, x(t_n)$ is $\omega$-measurable if and
only if its projection $f(\xi)$ from $X$ to $X^n$ is
$\omega^{t_1, \ldots, t_n}$-measurable, and then

$$\omega\{x(t) \in X: f(x) > c\} = \omega^{t_1, \ldots, t_n}\{\xi \in X^n: f(\xi_1, \ldots, \xi_n) > c\} \tag{36}$$

for arbitrary real $c$.




Proof. According to Theorem 5, p. 120, $\omega$-measurability of $f(x)$ is
equivalent to $\omega$-measurability of every set

$$\{x(t) \in X: f(x) > c\}, \tag{37}$$

while $\omega^{t_1, \ldots, t_n}$-measurability of $f(\xi)$ is equivalent to
$\omega^{t_1, \ldots, t_n}$-measurability of every set

$$\{\xi \in X^n: f(\xi) > c\}. \tag{38}$$

But, because of the nature of $f(x)$, (37) is a cylinder set with base (38),
and the rest of the proof follows from Theorem 17.


COROLLARY 2. A function $f(x)$, $x \in X$ depending on only a finite
number of coordinates $x(t_1), \ldots, x(t_n)$ is $\omega$-summable if and only
if its projection $f(\xi)$ from $X$ to $X^n$ is
$\omega^{t_1, \ldots, t_n}$-summable, and then

$$\int_X f(x)\,\omega(dx) = \int_{X^n} f(\xi_1, \ldots, \xi_n)\,\omega^{t_1, \ldots, t_n}(d\xi).$$


Proof. We need only use (36), recalling the construction of the
Lebesgue integral given in Secs. 6.6 and 8.3.


PROBLEMS 


1 ("Explicit construction of a nonmeasurable set"). On the unit square
$X = \{x: 0 \le x_1 \le 1,\ 0 \le x_2 \le 1\}$ consider the ring of sets
$E = E_1 \times [0, 1]$, where $E_1 \subset [0, 1]$ is Lebesgue-measurable with
measure $m(E_1)$. Then write $\mu(E) = m(E_1)$. Show that the set
$[0, 1] \times [0, \tfrac{1}{2}]$ is $\mu$-nonmeasurable.


Comment. Despite this problem, it is difficult to give an explicit construction 
of a nonmeasurable set in a general Hausdorff space all of whose open sets 
are measurable. 


2. Let $\mu$ be a Lebesgue measure on the real line, and let $Y \subset X$ be a
nonmeasurable set. Then $\mu_*(Y) = \alpha$, $\mu^*(Y) = \beta$, where
$\alpha < \beta$. Construct a Lebesgue extension $\nu$ of the measure $\mu$ in
which the set $Y$ is $\nu$-measurable with a given measure $\gamma = \nu(Y)$,
$\alpha \le \gamma \le \beta$.


Hint. If $\mu(X) = \beta$, $\mu_*(Y) = 0$, the $\mu$-measurable sets $A$ and $B$
are uniquely determined (to within sets of measure zero) by the set
$C = AY \cup B(X - Y)$. Let $\mathfrak{A}'$ (the domain of $\nu$) consist of all
such sets $C$, equipped with the measure

$$\nu(C) = \frac{\gamma}{\beta}\,\mu(A) + \Bigl(1 - \frac{\gamma}{\beta}\Bigr)\mu(B).$$

In the general case, there are $\mu$-measurable sets $E_1$ and $E_2$ such that
$E_1 \subset Y \subset E_2$, $\mu(E_1) = \alpha$, $\mu(E_2) = \beta$. Then apply
the same construction to the difference $E = E_2 - E_1$, i.e., let
$\mathfrak{A}'$ include all subsets of the set $E$ of the form

$$C = A(Y - E_1) \cup B(E_2 - Y), \qquad A \subset E, \quad B \subset E,$$

where $A$, $B$ are $\mu$-measurable and

$$\nu(C) = \frac{\gamma - \alpha}{\beta - \alpha}\,\mu(A) + \Bigl(1 - \frac{\gamma - \alpha}{\beta - \alpha}\Bigr)\mu(B).$$

Comment. The process of constructing Lebesgue extensions can be continued
by adjunction of further nonmeasurable sets $Y_2, Y_3, \ldots$, besides
$Y = Y_1$. This naturally raises the question of whether it is possible, by
using transfinite induction, to extend the measure $\mu$, as a countably
additive set function, onto the family of all subsets of $X$. An extension of
$\mu$ is indeed possible, but with preservation of finite additivity only;
countable additivity is lost on passage to the first uncountable ordinal. In
fact, there does not exist a countably additive set function defined on all
subsets of the interval $[0, 1]$ equal to zero on every set containing only one
point.¹⁴


3. Let $\mu$ be an additive set function defined on a semiring $\mathfrak{A}$
($\mu$ need not be nonnegative). Prove that $\mu$ can always be extended to an
additive set function on some ring $\mathfrak{B}$ containing $\mathfrak{A}$.


Hint. Let $\mathfrak{B}$ consist of all finite unions of disjoint sets of
$\mathfrak{A}$, and then define
$$\mu(B) = \sum_{k} \mu(A_k) \quad \text{for every } B = \bigcup_{k} A_k \in \mathfrak{B}.$$

Verify the uniqueness and additivity of $\mu$.


4. Suppose a countably additive nonnegative measure $\mu$ is extended from a
semiring $\mathfrak{A}$ to a ring $\mathfrak{B}$, as in the preceding problem.
Prove that the countable additivity of $\mu$ is preserved.


Hint. A sequence $B_1, B_2, \ldots$ of disjoint sets of $\mathfrak{B}$ can be
written as
$$B_1 = A_1 \cup \cdots \cup A_{n_1}, \quad B_2 = A_{n_1+1} \cup \cdots \cup A_{n_2}, \quad \ldots$$
in terms of disjoint sets of $\mathfrak{A}$.


5. Show that the three equivalent characterizations of countable additivity
for a ring, given by Properties a, b and c of Sec. 8.1, are not equivalent for a
semiring.


Hint. Consider the semiring consisting of all sets of rational numbers
in the interval $[0, 1]$ satisfying all possible inequalities of the form
$\alpha < x < \beta$, $\alpha \le x < \beta$, $\alpha < x \le \beta$,
$\alpha \le x \le \beta$, where $\alpha = \beta$ is not excluded, and introduce
the measure $\mu(A) = \beta - \alpha$. Property a is not satisfied (the measure
is additive, but not countably additive), although Property b holds.


6. Specialize the considerations of Sec. 8.5 to the case 1 = }. 
Hint. Cf. Sec. 4.7.4 and the remark on p. 105. 
¹⁴ This was shown with the use of the continuum hypothesis by S. Banach and C.
Kuratowski, Sur une généralisation du problème de la mesure, Fundamenta Mathematicae,
14, 127 (1929).




7.¹⁵ Let $X$ be an uncountably infinite set, and let $\mathfrak{A}$ be the ring consisting of
all finite subsets of $X$ and their complements. For finite $E \subset X$, let $\mu(E)$ equal
the number of elements in $E$, and otherwise let $\mu(X - E) = -\mu(E)$. Then the
measure $\mu(E)$ is countably additive on $\mathfrak{A}$, but its total variation does not exist.
Why is this compatible with the results of Sec. 8.4?


Hint. $\mathfrak{A}$ is a ring, but not a $\sigma$-ring.


15 Due to O. G. Smolyanov. 


Part 4 


THE DERIVATIVE 


9 


MEASURE AND SET FUNCTIONS 




9.1. Classification of Set Functions. Decomposition into 
Continuous and Discrete Components 


Let $\mathfrak{A}$ be a $\sigma$-ring of subsets of $X$, equipped with a nonnegative Borel
measure $\mu$. If $X \notin \mathfrak{A}$, we assume that $X = X_1 \cup X_2 \cup \cdots$, where
$X_1 \subset X_2 \subset \cdots$ and every $X_n \in \mathfrak{A}$. Besides the measure $\mu$, we shall consider
other countably additive (finite) set functions $\Phi(E)$, $E \in \mathfrak{A}$, defined on the
same $\sigma$-ring, where $\Phi(E)$ is in general signed (i.e., of variable sign). According
to Secs. 8.2 and 8.3, the Borel measure $\mu$ can be extended from the $\sigma$-ring
$\mathfrak{A}$ to a Lebesgue measure (which we continue to denote by $\mu$) on a $\sigma_\mu$-ring
$\mathfrak{A}_\mu \supset \mathfrak{A}$, which can then be used to construct a space of $\mu$-summable functions.
In general, however, the function $\Phi(E)$ cannot be extended onto the $\sigma_\mu$-ring
$\mathfrak{A}_\mu$.¹ In any event, as we know from Sec. 8.4, $\Phi(E)$ can be decomposed into
positive and negative variations $P(E)$ and $Q(E)$, defined on the original
$\sigma$-ring $\mathfrak{A}$, i.e., $\Phi(E) = P(E) - Q(E)$, $E \in \mathfrak{A}$.

With a view to classifying set functions, we now introduce some new
terminology, where in each of the following definitions, $\Phi(E)$ denotes a
countably additive set function, defined on the Borel sets $E$ of some $\sigma$-ring $\mathfrak{A}$:


1) $\Phi(E)$ is said to be concentrated on a set $E_0$ if it is defined and equal
to zero on every Borel set $E \subset X - E_0$. If $\Phi(E)$ is concentrated on a
set $E_0$, then so are its positive variation $P(E)$, negative variation $Q(E)$
and total variation $V(E) = P(E) + Q(E)$, since if $\Phi(E)$ vanishes on
every Borel set $E \subset X - E_0$, then, as in Sec. 8.4,

$$P(E) = \sup_{A \subset E} \Phi(A) = 0, \qquad Q(E) = \sup_{A \subset E} [-\Phi(A)] = 0,$$

$$V(E) = P(E) + Q(E) = 0 \qquad (A \in \mathfrak{A}).$$

¹ According to p. 161, the function $\Phi(E)$ can be extended to a signed finite Lebesgue
measure on some $\sigma$-ring $\mathfrak{A}^*$, but the $\sigma_\mu$-ring $\mathfrak{A}_\mu$ may not be contained in $\mathfrak{A}^*$ (see Prob. 10,
p. 204).


2) $\Phi(E)$ is said to be continuous if it is defined and equal to zero on
every set $E$ of measure zero containing a single point.

3) $\Phi(E)$ is said to be absolutely continuous if it is defined and equal to
zero on every set $E$ of measure zero.

4) $\Phi(E)$ is said to be singular if it is concentrated on a set $E_0$ of measure
zero.

5) $\Phi(E)$ is said to be discrete if it is concentrated on a set $E_0$ of measure
zero containing no more than countably many points.


Next we deduce some simple consequences of these definitions and our 
previous considerations: 


1) Every absolutely continuous set function is continuous. 

2) Every discrete set function is singular. 

3) The underlying Borel measure $\mu(E)$ is absolutely continuous.

4) If $\Phi(E)$ is absolutely continuous, so are its positive variation $P(E)$,
negative variation $Q(E)$ and total variation $V(E) = P(E) + Q(E)$.

5) If $\Phi(E)$ is absolutely continuous and singular, then $\Phi(E) \equiv 0$.

6) If $g(x)$ is a $\mu$-summable function, then

$$\Phi(E) = \int_E g(x)\,\mu(dx)$$

is an absolutely continuous set function.


There exist continuous set functions that are not absolutely continuous.
For example, let $X$ be the square $0 \le x \le 1$, $0 \le y \le 1$, equipped with
ordinary two-dimensional Lebesgue measure, and let $\Phi(E)$ be the ordinary
one-dimensional Lebesgue measure of the intersection of $E$ with the interval
$0 \le x \le 1$. Then $\Phi(E)$ is continuous and singular, but not absolutely
continuous. Some less trivial examples are given in Probs. 4 and 6, pp. 203-204.

If the set function $\Phi(E)$ is discrete, then

$$\Phi(E) = \sum_{c_m \in E} g_m \qquad (E \in \mathfrak{A}), \tag{1}$$

where $c_1, \ldots, c_m, \ldots$ is a sequence of points in $X$ of zero measure and
$g_1, \ldots, g_m, \ldots$ is a corresponding sequence of real numbers such that

$$\sum_{m=1}^{\infty} |g_m| < \infty \tag{2}$$

if $X \in \mathfrak{A}$.² If $X \notin \mathfrak{A}$, in which case $X = X_1 \cup X_2 \cup \cdots$, where $X_1 \subset X_2 \subset \cdots$
and every $X_n \in \mathfrak{A}$, we replace (2) by the condition that

$$\sum_{c_m \in X_n} |g_m| < \infty \qquad (n = 1, 2, \ldots).$$

The positive, negative and total variations $P(E)$, $Q(E)$ and $V(E)$ of a discrete
set function are easily found. In fact,

$$P(E) = \sum_{\substack{c_m \in E \\ g_m > 0}} g_m, \qquad Q(E) = \sum_{\substack{c_m \in E \\ g_m < 0}} |g_m|, \qquad V(E) = \sum_{c_m \in E} |g_m|.$$
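
The formulas for the variations of a discrete set function can be checked on a small example. The following is only an illustrative sketch: the dictionary `weights` (points $c_m$ mapped to numbers $g_m$) and the function names are assumptions made for this example, not notation from the text.

```python
from fractions import Fraction

def set_function(weights, E):
    """Phi(E): sum of the g_m whose points c_m lie in E, as in formula (1)."""
    return sum((g for c, g in weights.items() if c in E), Fraction(0))

def variations(weights, E):
    """Return (P(E), Q(E), V(E)) for the discrete set function given by weights."""
    P = sum((g for c, g in weights.items() if c in E and g > 0), Fraction(0))
    Q = sum((-g for c, g in weights.items() if c in E and g < 0), Fraction(0))
    return P, Q, P + Q

# Hypothetical data: four points with weights of both signs.
weights = {0: Fraction(1, 2), 1: Fraction(-1, 4), 2: Fraction(1, 8), 3: Fraction(-1, 16)}
E = {0, 1, 3}
P, Q, V = variations(weights, E)
```

Here $P(E) - Q(E)$ reproduces $\Phi(E)$ and $V(E) = P(E) + Q(E)$, in agreement with Sec. 8.4.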


THEOREM 1. Let $\mathfrak{A}$ be a $\sigma$-ring of subsets of $X$, equipped with a
nonnegative Borel measure $\mu(E)$ and a countably additive set function
$\Phi(E)$. Then $\Phi(E)$ can be represented in the form

$$\Phi(E) = C(E) + D(E), \tag{3}$$

where $C(E)$ is continuous and $D(E)$ is discrete.


Proof. Because of Theorem 8, p. 160, there is no loss of generality
in assuming that $\Phi(E)$ is nonnegative. Since, given any $\varepsilon > 0$, no Borel
set contains more than a finite number of points such that

$$\mu(\{x\}) = 0, \qquad \Phi(\{x\}) > \varepsilon,$$

the whole set $X$ can contain no more than countably many such points.
Choosing $\varepsilon = 1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots$, we see that the set of all $x$ such that

$$\mu(\{x\}) = 0, \qquad \Phi(\{x\}) > 0$$


contains no more than countably many points. Let these points be
$c_1, \ldots, c_m, \ldots$ Then, given any $E \in \mathfrak{A}$, the set function

$$D(E) = \sum_{c_m \in E} \Phi(\{c_m\}) \le \Phi(E) < \infty$$

is obviously discrete and finite. By construction, the function

$$C(E) = \Phi(E) - D(E)$$

takes the value zero on any set $\{x\}$ of measure zero, and hence is
continuous, thereby implying the representation (3).


² As on pp. 91, 146, (1) means the sum of all the numbers $g_m$ such that the corresponding
points $c_m$ belong to the set $E$. Note that $\Phi(\{c_m\}) = g_m$.



According to the Hahn decomposition (Theorem 10, p. 163), if $\Phi(E)$ is a
countably additive set function defined on a $\sigma$-ring $\mathfrak{A}$ of subsets of $X$, then
$X$ is the union of two generalized Borel sets $X^+$ and $X^-$ such that

$$\Phi(E) \ge 0 \ \text{ if } E \subset X^+, \qquad \Phi(E) \le 0 \ \text{ if } E \subset X^-$$

for every $E \in \mathfrak{A}$. We now prove a related result:

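THEOREM 2. Let $\mathfrak{A}$ be a $\sigma$-ring of subsets of $X$, equipped with a
nonnegative Borel measure $\mu(E)$ and a nonnegative countably additive set
function $\Phi(E)$, and let $a > 0$ be an arbitrary fixed number. Then $X$ is the
union $Z_0 \cup E_1 \cup E_2 \cup \cdots$ of a sequence of disjoint generalized Borel sets
$Z_0, E_1, E_2, \ldots$ such that

$$a(n - 1)\mu(E) \le \Phi(E) \le an\mu(E) \tag{4}$$

if $E \subset E_n$, $E \in \mathfrak{A}$, while

$$\mu(Z) = 0$$

if $Z \subset Z_0$, $Z \in \mathfrak{A}$.

Proof. Since $\Phi(E)$ and $\mu(E)$ are countably additive, so is the set
function

$$\Phi_n(E) = \Phi(E) - an\mu(E) \qquad (n = 0, 1, 2, \ldots).$$

Let $X = X_n^+ \cup X_n^-$ be the Hahn decomposition of $\Phi_n$. Then $\Phi(E) \ge an\mu(E)$
if $E \subset X_n^+$ and $\Phi_n(E) \le 0$ if $E \subset X_n^-$. Thus (4) holds on any subset
of the set $G_n = X_{n-1}^+ X_n^-$. Clearly

$$G_1 = X_0^+ X_1^- = X X_1^- = X_1^-,$$

$$G_2 = X_1^+ X_2^- = (X - X_1^-) X_2^- = X_2^- - X_1^- X_2^-, \ \ldots,$$

$$G_n = X_{n-1}^+ X_n^- = (X - X_{n-1}^-) X_n^- = X_n^- - X_{n-1}^- X_n^-, \ \ldots,$$

and moreover, the complement of $G = G_1 \cup G_2 \cup \cdots = X_1^- \cup X_2^- \cup \cdots$
(relative to $X$) is

$$Z_0 = X - \bigcup_n X_n^- = \bigcap_n (X - X_n^-) = \bigcap_n X_n^+.$$

If $Z \subset Z_0$, $Z \in \mathfrak{A}$, then $Z \subset X_n^+$ for every $n$, and hence

$$\Phi(Z) \ge an\mu(Z) \qquad (n = 1, 2, \ldots),$$

which implies $\mu(Z) = 0$. The sets $G_n$ will in general intersect. However,
the sets

$$E_1 = X_1^- = G_1, \quad E_2 = X_2^- - X_1^- X_2^- = G_2, \ \ldots,$$

$$E_n = X_n^- - X_1^- X_n^- - \cdots - X_{n-1}^- X_n^- \subset G_n, \ \ldots$$

are disjoint, and (4) holds on $E_n$. Moreover, the $E_n$ have the same union
$G$ as the $G_n$, and hence lead to the same set $Z_0$. This completes the proof.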


9.2. Decomposition of a Continuous Set Function into
Absolutely Continuous and Singular Components.
The Radon-Nikodym Theorem


Next we show how to carry the decomposition (3) even further: 


THEOREM 3. Let $\mathfrak{A}$ be a $\sigma$-ring of subsets of $X$, equipped with a
nonnegative Borel measure $\mu(E)$ and a countably additive set function $\Phi(E)$.
Then $\Phi(E)$ can be represented in the form

$$\Phi(E) = A(E) + S(E) + D(E), \tag{4}$$

where $A(E)$ is absolutely continuous, $S(E)$ is continuous and singular,
and $D(E)$ is discrete. Moreover, $A(E)$ itself has the representation

$$A(E) = \int_E f(x)\,\mu(dx), \tag{5}$$

where the function $f(x)$ is $\mu$-summable on every Borel set $E$. The
representation (4) is unique, and so is $f(x)$, to within a set of measure zero.


Proof.³ Again there is no loss of generality in assuming that $\Phi(E)$
is nonnegative. Moreover, because of Theorem 1, we can assume that
$\Phi(E)$ is continuous, provided a discrete component $D(E)$ is added to
the final answer. According to Theorem 2, for every $m = 1, 2, \ldots$ there
is a decomposition

$$X = Z^{(m)} \cup E_1^{(m)} \cup E_2^{(m)} \cup \cdots$$

of the set $X$ into disjoint generalized Borel sets $Z^{(m)}, E_1^{(m)}, E_2^{(m)}, \ldots$
(recall footnote 4, p. 163), such that

$$\frac{n - 1}{2^m}\,\mu(E) \le \Phi(E) \le \frac{n}{2^m}\,\mu(E), \qquad \mu(Z) = 0,$$

where $E$ is an arbitrary Borel set contained in $E_n^{(m)}$ and $Z$ is an arbitrary
Borel set contained in $Z^{(m)}$. Moreover, if

$$Z_0 = \bigcup_{m=1}^{\infty} Z^{(m)},$$

then any Borel set $Z$ contained in $Z_0$ also has measure zero. Consider
the function

$$f_m(x) = \frac{n - 1}{2^m} \quad \text{for } x \in E_n^{(m)} \quad (n = 1, 2, \ldots), \tag{6}$$

defined everywhere on $X$ except on the set $Z^{(m)}$. Given any Borel set $E$,
we have

$$E = EZ^{(m)} \cup EE_1^{(m)} \cup EE_2^{(m)} \cup \cdots,$$

and hence

$$\Phi(E) = \Phi(EZ^{(m)}) + \sum_{n=1}^{\infty} \Phi(EE_n^{(m)}) \ge \sum_{n=1}^{\infty} \frac{n - 1}{2^m}\,\mu(EE_n^{(m)}) = \int_E f_m(x)\,\mu(dx), \tag{7}$$

which, in particular, implies the $\mu$-summability of $f_m(x)$ on $E$. We also
have

$$\Phi(E) \le \Phi(EZ^{(m)}) + \sum_{n=1}^{\infty} \frac{n}{2^m}\,\mu(EE_n^{(m)}) \le \Phi(EZ_0) + \int_E f_m(x)\,\mu(dx) + \frac{1}{2^m}\,\mu(E), \tag{8}$$

where the last step uses $Z^{(m)} \subset Z_0$ and the nonnegativity of $\Phi$.

³ Following S. Saks, Theory of the Integral (translated by L. C. Young, with two notes
by S. Banach), second revised edition, Dover Publications, Inc., New York (1964), Chap. 1,
Sec. 14.

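We now estimate the difference between the functions $f_m(x)$ and
$f_{m+1}(x)$. Since $E_n^{(m)} E_k^{(m+1)}$ is contained in both $E_n^{(m)}$ and $E_k^{(m+1)}$,

$$\frac{n - 1}{2^m}\,\mu(E) \le \Phi(E) \le \frac{n}{2^m}\,\mu(E),$$

$$\frac{k - 1}{2^{m+1}}\,\mu(E) \le \Phi(E) \le \frac{k}{2^{m+1}}\,\mu(E)$$

for every Borel set $E \subset E_n^{(m)} E_k^{(m+1)}$, and hence

$$\frac{k - 1}{2^{m+1}}\,\mu(E) \le \frac{n}{2^m}\,\mu(E), \qquad \frac{n - 1}{2^m}\,\mu(E) \le \frac{k}{2^{m+1}}\,\mu(E).$$

Therefore, if $\mu(E) > 0$, we have $k - 1 \le 2n$ and $2(n - 1) \le k$, or
equivalently, $2n - 2 \le k \le 2n + 1$. Clearly, $\mu(E) = 0$ for other values
of $k$. It follows that, to within a set of measure zero,

$$E_n^{(m)} \subset E_{2n-2}^{(m+1)} \cup E_{2n-1}^{(m+1)} \cup E_{2n}^{(m+1)} \cup E_{2n+1}^{(m+1)},$$

and hence, at every point of $E_n^{(m)}$, except possibly on a set of measure
zero, $f_{m+1}(x)$ takes values from

$$\frac{2n - 3}{2^{m+1}} = \frac{n - 1}{2^m} - \frac{1}{2^{m+1}} \quad \text{to} \quad \frac{2n}{2^{m+1}} = \frac{n - 1}{2^m} + \frac{1}{2^m},$$

because of (6). Therefore the inequality

$$|f_m(x) - f_{m+1}(x)| \le \frac{1}{2^m}$$

holds everywhere on $E_n^{(m)}$, except possibly on a set of measure zero.
But then the functions $f_m(x)$ form a sequence converging uniformly
(almost everywhere on $X$) to a limit function

$$f(x) = \lim_{m \to \infty} f_m(x).$$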


Since, as shown above, $f_m(x)$ is $\mu$-summable on every Borel set $E$, the
same is true of $f(x)$, by Fatou's lemma (p. 37). Taking the limit as
$m \to \infty$ in the inequalities (7) and (8), we obtain

$$\int_E f(x)\,\mu(dx) \le \Phi(E) \le \Phi(EZ_0) + \int_E f(x)\,\mu(dx). \tag{9}$$

Now let

$$\Phi(E) = \int_E f(x)\,\mu(dx) + S(E) = A(E) + S(E),$$

where the set functions $S(E)$ and $A(E)$ are obviously countably additive.
Because of (9), $S(E)$ is nonnegative and satisfies the inequality $S(E) \le
\Phi(EZ_0)$. Therefore $S(E)$ is concentrated on the set $Z_0$ of measure zero,
and hence is singular. As for $A(E)$, it is absolutely continuous, being the
integral of a function $f(x)$ summable on every Borel set $E$. This proves
the representation (4) and (5).

We must still prove the uniqueness of the decomposition (4) and of
the function $f(x)$ figuring in (5). Suppose there are two decompositions
of $\Phi(E)$ into absolutely continuous and singular components:

$$\Phi(E) = A_1(E) + S_1(E), \qquad \Phi(E) = A_2(E) + S_2(E).$$

Then

$$A_1(E) - A_2(E) = S_2(E) - S_1(E),$$

where the left-hand side is absolutely continuous, while the right-hand
side is singular. But a function which is both absolutely continuous
and singular must vanish, and hence $A_1 = A_2$, $S_1 = S_2$, as required.
Finally, suppose there are two functions $f_1(x)$ and $f_2(x)$ such that

$$\int_E f_1(x)\,\mu(dx) = \int_E f_2(x)\,\mu(dx)$$

for every Borel set $E$, and let $f(x) = f_1(x) - f_2(x)$. Then

$$\int_E f(x)\,\mu(dx) = 0,$$

and choosing $E$ to be the set where $f(x) > 0$, we find that $f^+(x) = 0$
almost everywhere. Similarly, $f^-(x) = 0$ almost everywhere, and hence
the same is true of $f(x) = f^+(x) - f^-(x)$. In other words, the function
$f(x)$ figuring in (5) is unique to within a set of measure zero, and the proof
is complete.


COROLLARY (Radon-Nikodym theorem). Let $\mathfrak{A}$ be a $\sigma$-ring of subsets
of $X$ equipped with a nonnegative Borel measure $\mu(E)$ and an absolutely
continuous countably additive set function $\Phi(E)$. Then

$$\Phi(E) = \int_E f(x)\,\mu(dx),$$

where the function $f(x)$ is $\mu$-summable on every Borel set $E$ and is unique
to within a set of measure zero.
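
The dyadic approximation used in the proof of Theorem 3 can be illustrated on a finite set, where every subset is measurable. The sketch below is a toy model under stated assumptions: the point masses `mu` and `phi` are hypothetical data, and on a finite set the singular part is automatically discrete. On points of positive measure the density is the ratio of point masses, and $f_m$ rounds it down to a multiple of $2^{-m}$, which corresponds to taking $E_n^{(m)}$ to be the set where $(n-1)/2^m \le f < n/2^m$.

```python
from fractions import Fraction
from math import floor

# Hypothetical point masses; "z" has measure zero but carries Phi-mass.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6), "z": Fraction(0)}
phi = {"a": Fraction(3, 4), "b": Fraction(1, 5), "c": Fraction(0), "z": Fraction(2)}

def measure(E):
    return sum((mu[x] for x in E), Fraction(0))

def phi_of(E):
    return sum((phi[x] for x in E), Fraction(0))

def density(x):
    """f(x) = Phi({x}) / mu({x}), defined only where mu({x}) > 0."""
    return phi[x] / mu[x]

def f_m(x, m):
    """Dyadic step function of (6): the value (n - 1) / 2^m with n - 1 = floor(2^m f(x))."""
    return Fraction(floor(density(x) * 2**m), 2**m)

def integral(f, E):
    """Integral of f over E with respect to mu (points of measure zero contribute nothing)."""
    return sum((f(x) * mu[x] for x in E if mu[x] > 0), Fraction(0))

def absolutely_continuous_part(E):
    return integral(density, E)

def singular_part(E):
    return sum((phi[x] for x in E if mu[x] == 0), Fraction(0))
```

For a set $E$ of points of positive measure, the sums reproduce inequalities (7) and (8): the integral of $f_m$ is at most $\Phi(E)$, and exceeds it by no more than $\mu(E)/2^m$ from below.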


*9.3. Some Consequences of the Radon-Nikodym Theorem


We now draw some important conclusions from the Radon-Nikodym 
theorem, allowing us to continue the study of linear functionals begun in 
Secs. 5.5 and 8.7. 


9.3.1. The general continuous linear functional on the space $L(X)$. Given
a set $X$, let $L(X)$ be the space of all functions $f(x)$ summable on $X$, equipped
with the norm

$$\|f\| = I(|f|) = \int_X |f(x)|\,dx.$$

Let $\varphi(x)$ be measurable on $X$, and suppose $\varphi(x)$ is also essentially bounded
on $X$, i.e., suppose there is a (finite) number $C$ such that $|\varphi(x)| \le C$ on a
set of full measure (relative to $I$). Then the integral

$$I_\varphi f = I(f\varphi) = \int_X f(x)\varphi(x)\,dx$$

exists and defines a continuous linear functional on $L(X)$, since it satisfies
the following two conditions:


1) If $f_1$, $f_2$ are any two functions in $L(X)$ and $\alpha_1$, $\alpha_2$ are any two real
numbers, then

$$I_\varphi(\alpha_1 f_1 + \alpha_2 f_2) = \alpha_1 I_\varphi f_1 + \alpha_2 I_\varphi f_2.$$

2) If $f_m \in L(X)$ is a sequence such that $\|f_m\| \to 0$ as $m \to \infty$, then
$I_\varphi f_m \to 0$, as follows at once from the estimate

$$|I_\varphi f_m| = \left|\int_X f_m(x)\varphi(x)\,dx\right| \le \|f_m\| \operatorname*{ess\,sup}_{x \in X} |\varphi(x)|,$$

where

$$\operatorname*{ess\,sup}_{x \in X} |\varphi(x)|$$

denotes the greatest lower bound of all numbers $C$ such that
$|\varphi(x)| \le C$ on a set of full measure.


Next we prove the analogue of Theorem 1, p. 95 and Theorem 12, p. 166: 


THEOREM 4. Given a continuous linear functional $Jf$ defined on the
space $L(X)$, there exists a measurable essentially bounded function $\varphi(x)$
such that

$$Jf = \int_X f(x)\varphi(x)\,dx, \tag{10}$$

and moreover

$$\|J\| = \operatorname*{ess\,sup}_{x \in X} |\varphi(x)|, \tag{11}$$

where

$$\|J\| = \sup_{\|f\| \le 1} |Jf|$$

is the norm of the functional.⁴


Proof. For every summable set $E$, with characteristic function $\chi_E$,
we write

$$\Phi(E) = J\chi_E,$$

thereby defining a countably additive set function $\Phi(E)$. Since

$$|\Phi(E)| = |J\chi_E| \le \|J\|\,\|\chi_E\| = \|J\|\,I(\chi_E) = \|J\|\,\mu(E), \tag{12}$$

$\Phi(E)$ is absolutely continuous. Therefore, according to the Radon-Nikodym
theorem,

$$\Phi(E) = \int_E \varphi(x)\,dx = I(\chi_E \varphi),$$

where $\varphi(x)$ is summable on every summable set $E$. Moreover, $\varphi(x)$
satisfies the inequality

$$\operatorname*{ess\,sup}_{x \in X} |\varphi(x)| \le \|J\|.$$

In fact, suppose $|\varphi(x)| > \|J\|$ on a set $E$ such that $0 < \mu(E) < \infty$,
and let $E = E_+ \cup E_-$, where $E_+ = \{x: \varphi(x) > \|J\|\}$ and $E_- =
\{x: \varphi(x) < -\|J\|\}$. Then at least one of the sets $E_+$ and $E_-$ has positive
measure. If $\mu(E_+) > 0$, we have

$$J\chi_{E_+} = \Phi(E_+) = \int_{E_+} \varphi(x)\,dx > \|J\|\,\mu(E_+),$$

while if $\mu(E_-) > 0$,

$$|J\chi_{E_-}| = |\Phi(E_-)| = \left|\int_{E_-} \varphi(x)\,dx\right| > \|J\|\,\mu(E_-).$$

On the other hand, according to (12),

$$|\Phi(E_\pm)| \le \|J\|\,\mu(E_\pm).$$


This contradiction shows that $\mu(E) = 0$. By construction, the two
continuous linear functionals $Jf$ and

$$I_\varphi f = \int_X f(x)\varphi(x)\,dx$$

coincide on the characteristic functions of all summable sets. But linear
combinations of such functions are dense in $L(X)$, and hence $Jf = I_\varphi f$
for all $f \in L(X)$, which is just the relation (10). Moreover,

$$\operatorname*{ess\,sup}_{x \in X} |\varphi(x)| \le \|J\| = \|I_\varphi\| \le \operatorname*{ess\,sup}_{x \in X} |\varphi(x)|,$$

which implies (11) and completes the proof.

⁴ We remind the reader of the following elementary facts about the continuous linear
functional $J$ (which can take either sign, contrary to the notation of Sec. 2.11). Since $J$ is
continuous at $\|f\| = 0$, given any $\varepsilon > 0$, there is a neighborhood $\|f\| \le \delta$ in which
$|Jf| \le \varepsilon$. But then, since $J$ is linear,

$$|Jf| = \frac{1}{\delta}\,|J(\delta f)| \le \frac{\varepsilon}{\delta}$$

if $\|f\| \le 1$, and hence $\|J\|$ exists. Moreover, if $\|f\| \ne 0$,

$$\left|J\!\left(\frac{f}{\|f\|}\right)\right| = \frac{|Jf|}{\|f\|} \le \|J\|,$$

i.e., $|Jf| \le \|J\|\,\|f\|$, an inequality which obviously continues to hold if $\|f\| = 0$.


9.3.2. The general continuous linear functional on the space $L_p(X)$. Given
a set $X$, let $L_p(X)$, $p > 1$ be the space of all functions $f(x)$ measurable on $X$
such that

$$I(|f|^p) = \int_X |f(x)|^p\,dx < \infty.$$

Then, as shown in Sec. 6.9, $L_p(X)$ is a complete normed linear space, when
equipped with the norm

$$\|f\|_p = [I(|f|^p)]^{1/p}.$$

Let $\varphi(x)$ belong to $L_q(X)$, where

$$\frac{1}{p} + \frac{1}{q} = 1.$$

Then the integral

$$I_\varphi f = I(f\varphi) = \int_X f(x)\varphi(x)\,dx$$

exists and defines a continuous linear functional on $L_p(X)$, since it satisfies
the following two conditions:


1) If $f_1$, $f_2$ are any two functions in $L_p(X)$ and $\alpha_1$, $\alpha_2$ any two real numbers,
then

$$I_\varphi(\alpha_1 f_1 + \alpha_2 f_2) = \alpha_1 I_\varphi f_1 + \alpha_2 I_\varphi f_2.$$

2) If $f_m \in L_p(X)$ is a sequence such that $\|f_m\|_p \to 0$ as $m \to \infty$, then
$I_\varphi f_m \to 0$, as follows at once from the estimate

$$|I_\varphi f_m| = \left|\int_X f_m(x)\varphi(x)\,dx\right| \le \|f_m\|_p\,\|\varphi\|_q$$

(use Hölder's inequality, p. 127).

The appropriate analogue of Theorem 4 is now


THEOREM 5. Given any continuous linear functional $Jf$ defined on the
space $L_p(X)$, there exists a function $\varphi(x) \in L_q(X)$ such that

$$Jf = \int_X f(x)\varphi(x)\,dx, \tag{13}$$

and moreover

$$\|J\| = \|\varphi\|_q. \tag{14}$$


Proof. First suppose $\mu(X) < \infty$, in which case $L_p(X) \subset L(X)$, since

$$I(|f|) \le [I(|f|^p)]^{1/p}\,[\mu(X)]^{1/q}.$$

For every measurable (and hence summable) set $E$, with characteristic
function $\chi_E$, we write

$$\Phi(E) = J\chi_E.$$

Then

$$|\Phi(E)| = |J\chi_E| \le \|J\|\,\|\chi_E\|_p = \|J\|\,[I(\chi_E)]^{1/p} = \|J\|\,[\mu(E)]^{1/p},$$

and hence the set function $\Phi(E)$ is absolutely continuous. Therefore,
according to the Radon-Nikodym theorem,

$$\Phi(E) = \int_E \varphi(x)\,dx = I(\chi_E \varphi),$$

where $\varphi(x)$ is summable. Moreover, $\varphi(x)$ belongs to $L_q(X)$ and satisfies
the inequality

$$\|\varphi\|_q \le \|J\|.$$

In fact, suppose $0 \le f \le |\varphi|$, where $f$ is bounded and measurable. As
on p. 121, the function $f^{q-1} \operatorname{sgn} \varphi$ can be represented as the limit of a
uniformly convergent sequence $h_n$ of elementary functions,⁵ which we
take to be linear combinations of characteristic functions of the measurable
sets. It follows that

$$I(f^q) \le I(f^{q-1}|\varphi|) = I(f^{q-1} \operatorname{sgn} \varphi \cdot \varphi) = \lim_{n \to \infty} I(h_n \varphi) = \lim_{n \to \infty} Jh_n$$

$$= J(f^{q-1} \operatorname{sgn} \varphi) \le \|J\|\,\|f^{q-1}\|_p = \|J\|\,[I(f^q)]^{1/p}$$

(note that $h_n$ also approaches $f^{q-1} \operatorname{sgn} \varphi$ in the $L_p$-norm, and that $(q-1)p = q$). Therefore

$$[I(f^q)]^{1 - 1/p} = [I(f^q)]^{1/q} \le \|J\|,$$

and letting $f \nearrow |\varphi|$ (cf. p. 115), we deduce that

$$[I(|\varphi|^q)]^{1/q} \le \|J\|,$$

i.e., $\varphi \in L_q(X)$ and $\|\varphi\|_q \le \|J\|$, as asserted.
By construction, the two continuous linear functionals $Jf$ and

$$I_\varphi f = \int_X f(x)\varphi(x)\,dx$$

coincide on the characteristic functions of all measurable sets. But linear
combinations of such functions are dense in $L_p(X)$ [cf. Theorem 8,
p. 130], and hence $Jf = I_\varphi f$ for all $f \in L_p(X)$, which is just the relation
(13). Moreover,

$$\|\varphi\|_q \le \|J\| = \|I_\varphi\| \le \|\varphi\|_q, \tag{15}$$

which implies (14).

⁵ By $\operatorname{sgn} x$ is meant the function equal to 1 if $x > 0$, 0 if $x = 0$ and $-1$ if $x < 0$.


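Finally suppose $\mu(X) = \infty$. Then, as in Remark 2, p. 121, $X =
X_1 \cup X_2 \cup \cdots$, where $X_1 \subset X_2 \subset \cdots$ and every $\mu(X_n) < \infty$. Given
any $f \in L_p(X)$, let

$$f_n(x) = \begin{cases} f(x) & \text{for } x \in X_n, \\ 0 & \text{for } x \in X - X_n. \end{cases}$$

Then obviously $f_n \in L_p(X_n) \subset L_p(X)$, and hence, as just shown, there
is a function $\varphi_n \in L_q(X_n) \subset L_q(X)$ such that

$$Jf_n = \int_{X_n} f_n(x)\varphi_n(x)\,dx, \qquad \|\varphi_n\|_q \le \|J\|_n \le \|J\|, \tag{16}$$

where $\|J\|_n$ denotes the norm of $J$ relative to the space $L_p(X_n)$ and
$\|J\|$ its norm relative to $L_p(X)$, as before. Extend $\varphi_n$ to all of $X$ by setting

$$\varphi_n(x) = \begin{cases} \varphi_n(x) & \text{for } x \in X_n, \\ 0 & \text{for } x \in X - X_n, \end{cases}$$

and consider the function

$$\varphi(x) = \lim_{n \to \infty} \varphi_n(x),$$

where the limit exists, since $\varphi_n(x) = \varphi_{n+1}(x) = \cdots$ if $x \in X_n$. Then (16)
can be written in the form

$$Jf_n = \int_X f_n(x)\varphi(x)\,dx, \qquad \int_{X_n} |\varphi(x)|^q\,dx \le \|J\|^q.$$

It follows from the last assertion of Property c, p. 124 that $|\varphi|^q$ is
summable on $X$ and

$$\int_X |\varphi(x)|^q\,dx = \lim_{n \to \infty} \int_{X_n} |\varphi(x)|^q\,dx \le \|J\|^q,$$

i.e., $\varphi \in L_q(X)$ and

$$\|\varphi\|_q \le \|J\|.$$

Moreover, by the same argument, $f_n \to f$ in the $L_p$-norm, and hence

$$Jf = \lim_{n \to \infty} Jf_n = \lim_{n \to \infty} \int_X f_n(x)\varphi(x)\,dx = \int_X f(x)\varphi(x)\,dx.$$

This proves (13), and (14) again follows from (15).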
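
The norm identity (14) can be checked numerically in the simplest setting, a finite set of points with positive weights, where integrals become weighted sums. The following is only a sketch under that assumption; the weights `w`, the vectors, and the function names are hypothetical. The test function $f = |\varphi|^{q-1} \operatorname{sgn} \varphi$ is the one used in the proof of Theorem 5.

```python
def norm(v, w, p):
    """Weighted p-norm: (sum_i w_i |v_i|^p)^(1/p), the finite analogue of [I(|f|^p)]^(1/p)."""
    return sum(wi * abs(vi) ** p for vi, wi in zip(v, w)) ** (1.0 / p)

def pairing(f, phi, w):
    """Finite analogue of I(f phi): sum_i w_i f_i phi_i."""
    return sum(wi * fi * pi for fi, pi, wi in zip(f, phi, w))

def extremal(phi, q):
    """f = |phi|^(q-1) sgn(phi), the function on which Hoelder's inequality is sharp."""
    return [abs(x) ** (q - 1) * ((x > 0) - (x < 0)) for x in phi]

p = 3.0
q = p / (p - 1)                 # conjugate exponent, 1/p + 1/q = 1
w = [0.5, 1.0, 2.0]             # hypothetical point masses
phi = [1.0, -2.0, 0.5]          # hypothetical function in L_q
f = extremal(phi, q)
ratio = pairing(f, phi, w) / norm(f, w, p)   # equals ||phi||_q up to rounding
```

For any other `g`, the quotient `abs(pairing(g, phi, w)) / norm(g, w, p)` stays below `norm(phi, w, q)`, which is Hölder's inequality; the extremal function attains the bound.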


9.4. Positive, Negative and Total Variations of the Sum
of Two Set Functions


Let $\Phi_1(E)$ and $\Phi_2(E)$ be two countably additive set functions with positive
variations $P_1(E)$ and $P_2(E)$, and let $P(E)$ be the positive variation of their sum

$$\Phi(E) = \Phi_1(E) + \Phi_2(E).$$

For any $A \subset E$ we have

$$\Phi(A) = \Phi_1(A) + \Phi_2(A) \le P_1(E) + P_2(E),$$

and hence

$$P(E) \le P_1(E) + P_2(E), \tag{17}$$

after taking the least upper bound of the left-hand side. The equality

$$P(E) = P_1(E) + P_2(E) \tag{18}$$

does not hold in general, as shown by the example $\Phi_1(E) = -\Phi_2(E) \not\equiv 0$.
On the other hand, we can easily establish the following


THEOREM 6. If $\Phi_1(E)$ and $\Phi_2(E)$ are concentrated on disjoint sets $E_1$
and $E_2$, then (18) holds.


Proof. Given any $E$ and $\varepsilon > 0$, we can find sets $A_1 \subset EE_1$ and
$A_2 \subset EE_2$ such that

$$P_1(E) = P_1(EE_1) \le \Phi_1(A_1) + \varepsilon,$$

$$P_2(E) = P_2(EE_2) \le \Phi_2(A_2) + \varepsilon,$$

and hence

$$P(E) \ge \Phi(A_1 \cup A_2) = \Phi(A_1) + \Phi(A_2) = \Phi_1(A_1) + \Phi_2(A_2) \ge P_1(E) + P_2(E) - 2\varepsilon,$$

since $A_1$ and $A_2$ are disjoint. Letting $\varepsilon$ approach zero, we obtain the
inequality

$$P(E) \ge P_1(E) + P_2(E),$$

which, together with (17), implies (18).

COROLLARY 1. If $\Phi_1(E)$ and $\Phi_2(E)$ are concentrated on disjoint sets
$E_1$ and $E_2$, then

$$Q(E) = Q_1(E) + Q_2(E),$$

$$V(E) = V_1(E) + V_2(E),$$

where $Q_1$, $Q_2$, $Q$ are the negative variations of the functions $\Phi_1$, $\Phi_2$,
$\Phi = \Phi_1 + \Phi_2$, and $V_1$, $V_2$, $V$ are their total variations.
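
Both the failure of (18) in general and its validity under the hypothesis of Theorem 6 are easy to see for discrete set functions on a finite set. The sketch below is illustrative only; the weight dictionaries and helper names are assumptions made for this example.

```python
def positive_variation(weights, E):
    """P(E) for a discrete set function: the sum of the positive weights carried by E."""
    return sum(g for c, g in weights.items() if c in E and g > 0)

def add(w1, w2):
    """Weights of the sum Phi_1 + Phi_2, added point by point."""
    return {c: w1.get(c, 0) + w2.get(c, 0) for c in set(w1) | set(w2)}

E = {0, 1, 2, 3}

# Counterexample: Phi_1 = -Phi_2, so the sum vanishes identically.
phi1, phi2 = {0: 1}, {0: -1}
strict = (positive_variation(add(phi1, phi2), E),
          positive_variation(phi1, E) + positive_variation(phi2, E))

# Theorem 6: set functions concentrated on the disjoint sets {0, 1} and {2, 3}.
psi1, psi2 = {0: 1, 1: -2}, {2: 3, 3: -1}
equal = (positive_variation(add(psi1, psi2), E),
         positive_variation(psi1, E) + positive_variation(psi2, E))
```

In the first case only the inequality (17) survives (0 against 1), while in the second case both sides equal 4.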


COROLLARY 2. The total variation of any countably additive set func- 
tion equals the sum of the total variation of its continuous component and 
the total variation of its discrete component. 


COROLLARY 3. The total variation of any continuous countably addi- 
tive set function equals the sum of the total variation of its absolutely 
continuous component and the total variation of its singular component. 


Proof. If $\Phi_1(E)$ is absolutely continuous and $\Phi_2(E)$ is singular
(and hence concentrated on a set $Z$ of measure zero), then $\Phi_1(E)$ is
concentrated on $X - Z$, since it vanishes on every Borel subset

$$E \subset X - (X - Z) = Z.$$


9.5. The Case X = [a, b]. Absolutely Continuous
Point Functions


Suppose the countably additive set function $\Phi(E)$ is defined on a $\sigma$-ring
$\mathfrak{A} = \mathfrak{A}[a, b]$ of subsets of a fixed finite interval $[a, b]$, where $\mathfrak{A}$ contains
all subintervals $(\alpha, \beta] \subset [a, b]$,⁶ and hence all classical Borel sets. Let $\mu(E)$
be ordinary one-dimensional Lebesgue measure. Then $\Phi(E)$, regarded as a
quasi-length, is characterized by its generating function $F(x) = \Phi[a, x]$.
Obviously, $F(x)$ is of bounded variation and continuous from the right.
Moreover, if $\Phi(E)$ is continuous, then so is $F(x)$, and conversely, since

$$F(x) - F(x - 0) = \lim_{\xi \nearrow x} [F(x) - F(\xi)] = \lim_{\xi \nearrow x} \Phi(\xi, x] = \Phi(\{x\}) = 0.$$

To characterize the generating function of an absolutely continuous set
function, we will need the following


DEFINITION. A point function $F(x)$ defined in the interval $[a, b]$ is said
to be absolutely continuous in $[a, b]$ if, given any $\varepsilon > 0$, there exists a
$\delta > 0$ such that

$$\sum_{k=1}^{n} |F(\beta_k) - F(\alpha_k)| < \varepsilon \tag{19}$$

for every collection of disjoint subintervals $(\alpha_k, \beta_k] \subset [a, b]$ such that

$$\sum_{k=1}^{n} (\beta_k - \alpha_k) < \delta.$$


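Remark. An equivalent definition is obtained by replacing the inequality
(19) by

$$\left|\sum_{k=1}^{n} [F(\beta_k) - F(\alpha_k)]\right| < \varepsilon$$

[obviously implied by (19)]. In fact, given any $\varepsilon > 0$, let $\delta > 0$ be such that

$$\left|\sum_{k=1}^{n} [F(\beta_k) - F(\alpha_k)]\right| < \frac{\varepsilon}{2} \tag{20}$$

for every collection of disjoint subintervals $(\alpha_k, \beta_k] \subset [a, b]$ whose total
length is less than $\delta$. In general, $F(\beta_k) - F(\alpha_k) \ge 0$ for some of the
subintervals $(\alpha_k, \beta_k]$, and $F(\beta_k) - F(\alpha_k) < 0$ for the others. Correspondingly,
the sum in (20) splits into two sums

$$\sum\nolimits' [F(\beta_k) - F(\alpha_k)] + \sum\nolimits'' [F(\beta_k) - F(\alpha_k)],$$

where

$$\left|\sum\nolimits' [F(\beta_k) - F(\alpha_k)]\right| < \frac{\varepsilon}{2}, \qquad \left|\sum\nolimits'' [F(\beta_k) - F(\alpha_k)]\right| < \frac{\varepsilon}{2},$$

since the sum of the lengths of each collection of subintervals is obviously
less than $\delta$. But then

$$\sum_{k=1}^{n} |F(\beta_k) - F(\alpha_k)| = \sum\nolimits' [F(\beta_k) - F(\alpha_k)] + \left|\sum\nolimits'' [F(\beta_k) - F(\alpha_k)]\right| < \varepsilon,$$

i.e., (19) holds, as required.

⁶ As usual, we replace $(\alpha, \beta]$ by $[\alpha, \beta]$ if $\alpha = a$. Note that $\Phi(E)$ is defined for $E = X =
[a, b]$.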


Next we note some simple properties of absolutely continuous point 
functions: 


1) If $F(x)$ is absolutely continuous in $[a, b]$, then $F(x)$ is uniformly
continuous in $[a, b]$.

2) If $F(x)$ is absolutely continuous in $[a, b]$, then $F(x)$ is of bounded
variation in $[a, b]$. In fact, if $\delta$ corresponds to the choice $\varepsilon = 1$ in
(19), then the total variation of $F(x)$ is less than 1 in every subinterval
$(\alpha_k, \beta_k] \subset [a, b]$ of length less than $\delta$. Therefore

$$V_a^b(F) \le N \cdot 1 \le \frac{b - a}{\delta} + 1,$$

where $N$ is the smallest number of disjoint subintervals $(\alpha_k, \beta_k]$ of length
less than $\delta$ needed to cover $[a, b]$ and $V_a^b(F)$ is the total variation of $F(x)$ in $[a, b]$.

3) If $F(x)$ is absolutely continuous in $[a, b]$, then so are its positive, negative
and total variations. For example, let $V(x)$ be the total variation of
$F(x)$, and given any $\varepsilon > 0$, let $\delta > 0$ be such that the inequality
(19) holds for every collection of disjoint subintervals $(\alpha_k, \beta_k]$ with
total length less than $\delta$. By definition of the total variation (cf. p. 85),
the sum

$$\sum_{k=1}^{n} [V(\beta_k) - V(\alpha_k)] = \sum_{k=1}^{n} V(\alpha_k, \beta_k] \tag{21}$$

is the least upper bound of the quantity

$$\sum_{k=1}^{n} \sum_{j=1}^{m_k} |F(x_j^{(k)}) - F(x_{j-1}^{(k)})|, \tag{22}$$

where $\alpha_k = x_0^{(k)} < \cdots < x_{m_k}^{(k)} = \beta_k$ is an arbitrary partition of the
interval $(\alpha_k, \beta_k]$. By construction, the sum of the lengths of the disjoint
subintervals $(x_{j-1}^{(k)}, x_j^{(k)}]$ is less than $\delta$, and hence every sum (22) is less
than $\varepsilon$. But then their least upper bound (21) cannot exceed $\varepsilon$, i.e.,
$V(x)$ is itself absolutely continuous, as asserted.


After these preliminaries, we now prove

THEOREM 7. A countably additive set function $\Phi(E)$ defined on $\mathfrak{A} =
\mathfrak{A}[a, b]$ is absolutely continuous if and only if its generating function $F(x)$
is absolutely continuous in $[a, b]$.

Proof. Let $F(x)$ be absolutely continuous in $[a, b]$, where, by Property 3
above, there is no loss of generality in assuming that $F(x)$ is
nondecreasing. Given any $\varepsilon > 0$, choose $\delta > 0$ such that

$$\sum_{k=1}^{n} [F(\beta_k) - F(\alpha_k)] < \varepsilon \tag{23}$$

for every disjoint collection of subintervals $(\alpha_k, \beta_k] \subset [a, b]$ of total
length less than $\delta$. Then, given any $Z \in \mathfrak{A}$ of measure zero, let $\{\Delta_k\}$ be a
countable collection of subintervals of total length less than $\delta$ covering $Z$
(cf. Sec. 1.4). These intervals can be made disjoint by the usual device
of replacing $\Delta_1, \Delta_2, \Delta_3, \ldots$ by

$$\Delta_1, \quad \Delta_2 - \Delta_1\Delta_2, \quad \Delta_3 - \Delta_1\Delta_3 - \Delta_2\Delta_3, \quad \ldots,$$

if necessary. Because of (23), we have

$$\Phi\Bigl(\bigcup_{k=1}^{n} \Delta_k\Bigr) = \sum_{k=1}^{n} \Phi(\Delta_k) < \varepsilon$$

for every finite subcollection $\Delta_1, \ldots, \Delta_n$. But $\Phi(E)$ is countably additive,
and hence

$$\Phi\Bigl(\bigcup_{k=1}^{\infty} \Delta_k\Bigr) = \lim_{n \to \infty} \Phi\Bigl(\bigcup_{k=1}^{n} \Delta_k\Bigr) \le \varepsilon,$$

which implies

$$\Phi(Z) \le \varepsilon,$$

since

$$Z \subset \bigcup_{k=1}^{\infty} \Delta_k.$$

Therefore $\Phi(Z) = 0$, since $\varepsilon > 0$ is arbitrary.


Conversely, if $\Phi(E)$ is absolutely continuous, then, by the Radon-Nikodym
theorem,

$$\Phi(E) = \int_E g(x)\,dx,$$

where $g(x)$ is summable on $[a, b]$. In particular,

$$F(x) = \Phi[a, x] = \int_a^x g(\xi)\,d\xi,$$

and hence

$$\sum_{k=1}^{n} [F(\beta_k) - F(\alpha_k)] = \sum_{k=1}^{n} \int_{\alpha_k}^{\beta_k} g(\xi)\,d\xi = \int_{\Delta_1 \cup \cdots \cup \Delta_n} g(\xi)\,d\xi, \tag{24}$$

where $\Delta_k = (\alpha_k, \beta_k]$. But by Property d, p. 124, the right-hand side of
(24) approaches zero as the total length of the intervals $\Delta_1, \ldots, \Delta_n$
goes to zero. This completes the proof (recall the remark on p. 196).


COROLLARY. If $F(x)$ is absolutely continuous in $[a, b]$, then

$$F(x) = F(a) + \int_a^x g(\xi)\,d\xi, \tag{25}$$

where $g(x)$ is summable on $[a, b]$. Conversely, every function of the form
(25) is absolutely continuous in $[a, b]$.


9.6. The Lebesgue Decomposition 


We continue our study of the case $X = [a, b]$, by allowing $\Phi(E)$ to have
singular and discrete components. First we prove


THEOREM 8. A countably additive set function $\Phi(E)$ defined on $\mathfrak{A} =
\mathfrak{A}[a, b]$ is singular if and only if its generating function $F(x)$ is singular in
$[a, b]$ in the following sense:⁷ Given any $\varepsilon > 0$, there exists a finite
collection of subintervals $(\alpha_k, \beta_k] \subset [a, b]$ such that

$$\sum_{k=1}^{n} (\beta_k - \alpha_k) < \varepsilon, \qquad \sum_{k=1}^{n} |F(\beta_k) - F(\alpha_k)| > V_a^b(F) - \varepsilon. \tag{26}$$

⁷ This presupposes that a singular function is of bounded variation. A condition
equivalent to (26) is the following: Given any $\varepsilon > 0$, there exists a finite collection of
subintervals $(\alpha_k, \beta_k] \subset [a, b]$ such that

$$\sum_{k=1}^{n} (\beta_k - \alpha_k) > b - a - \varepsilon, \qquad \sum_{k=1}^{n} |F(\beta_k) - F(\alpha_k)| < \varepsilon.$$

Proof. First suppose $\Phi(E)$ is singular, and let $Z$ be the set of measure
zero on which $\Phi(E)$, and hence its total variation $V(E)$, is concentrated.
As we know from Sec. 4.7.4, regarded as a quasi-length, $V(E)$ has the
generating function

$$V_a^x(F) = \sup \sum_{k=1}^{n} |F(x_k) - F(x_{k-1})|,$$

where the least upper bound is taken with respect to all partitions
$a = x_0 < \cdots < x_n = x$ of the interval $[a, x]$.⁸ Given any $\varepsilon > 0$, let
$(\alpha_k, \beta_k]$, $k = 1, 2, \ldots$ be a countable collection of disjoint subintervals
covering $Z$ such that

$$\sum_{k=1}^{\infty} (\beta_k - \alpha_k) < \varepsilon.$$

⁸ In particular, $V_a^b(F) = V[a, b]$.

Clearly

$$\sum_{k=1}^{\infty} V(\alpha_k, \beta_k] = V_a^b(F),$$

since $V(E)$ is concentrated on $Z$. Let $n = n(\varepsilon)$ be such that

$$\sum_{k=1}^{n} V(\alpha_k, \beta_k] > V_a^b(F) - \frac{\varepsilon}{2},$$

and then, in each of the $n$ subintervals $(\alpha_1, \beta_1], \ldots, (\alpha_n, \beta_n]$, choose
points $\alpha_k = x_0^{(k)} < \cdots < x_{m_k}^{(k)} = \beta_k$ such that

$$\sum_{j=1}^{m_k} |F(x_j^{(k)}) - F(x_{j-1}^{(k)})| > V(\alpha_k, \beta_k] - \frac{\varepsilon}{2n}.$$

It follows that

$$\sum_{k=1}^{n} \sum_{j=1}^{m_k} |F(x_j^{(k)}) - F(x_{j-1}^{(k)})| > \sum_{k=1}^{n} V(\alpha_k, \beta_k] - \frac{\varepsilon}{2} > V_a^b(F) - \varepsilon,$$

which proves (26), since obviously

$$\sum_{k=1}^{n} (\beta_k - \alpha_k) < \varepsilon.$$

Conversely, suppose $F(x)$ satisfies (26). Then, by Theorems 1 and 3,
$\Phi(E)$ can be written in the form

$$\Phi(E) = A(E) + S(E),$$

where $A(E)$ is absolutely continuous and $S(E)$ is singular but not
necessarily continuous.⁹ Correspondingly, $F(x)$ itself has the representation

$$F(x) = A(x) + S(x), \tag{27}$$

where, by Theorem 7, $A(x)$ is absolutely continuous. By hypothesis, given
any $\varepsilon > 0$, there is a collection of disjoint subintervals $(\alpha_k, \beta_k] \subset [a, b]$
of total length less than $\varepsilon$ such that

$$\sum_{k=1}^{n} |F(\beta_k) - F(\alpha_k)| > V_a^b(F) - \varepsilon,$$

and hence

$$\sum_{k=1}^{n} |S(\beta_k) - S(\alpha_k)| > V_a^b(F) - \sum_{k=1}^{n} |A(\beta_k) - A(\alpha_k)| - \varepsilon$$

because of (27). The sum on the right approaches zero as $\varepsilon \to 0$, because
of the absolute continuity of $A(x)$. Therefore

$$\sum_{k=1}^{n} |S(\beta_k) - S(\alpha_k)| > V_a^b(F) - \varepsilon',$$

where $\varepsilon' > 0$ is arbitrary, and hence the total variation of $S(x)$ cannot
be less than $V_a^b(F)$. But then, by Theorem 6, Corollary 3, the total
variation of $A(x)$ vanishes. It follows that $A(x) \equiv 0$, and hence

$$F(x) = S(x), \qquad \Phi(E) = S(E),$$

as required.

⁹ Recall from p. 184 that a discrete set function is singular.
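
A standard illustration of condition (26), not taken from the text, is the Cantor function on $[0, 1]$: it is continuous and nondecreasing with total variation 1, yet all of its growth takes place on the $2^n$ intervals of the $n$th stage of the Cantor construction, whose total length $(2/3)^n$ is arbitrarily small. The sketch below evaluates the function exactly at rational points with finite ternary expansions; the function names are assumptions made for this example.

```python
from fractions import Fraction

def cantor(x, max_digits=200):
    """Cantor function at a rational x in [0, 1], read off from ternary digits."""
    x = Fraction(x)
    if x == 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(max_digits):
        if x == 0:
            break
        x *= 3
        digit = x.numerator // x.denominator   # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:                         # the first digit 1 ends the expansion
            return value + scale
        if digit == 2:
            value += scale
        scale /= 2
    return value

def stage_intervals(n):
    """The 2^n closed intervals remaining after n steps of the Cantor construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for left, right in intervals:
            third = (right - left) / 3
            refined += [(left, left + third), (right - third, right)]
        intervals = refined
    return intervals

n = 6
intervals = stage_intervals(n)
total_length = sum(right - left for left, right in intervals)
total_increment = sum(abs(cantor(right) - cantor(left)) for left, right in intervals)
```

With $n = 6$ the intervals have total length $(2/3)^6 = 64/729$ while the increments of the function over them still add up to the full variation 1, so (26) holds for every $\varepsilon > 64/729$, and larger $n$ handles smaller $\varepsilon$.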


Next we consider the case where $\Phi(E)$ is discrete. Then, according to
Sec. 9.1,

$$\Phi(E) = \sum_{c_k \in E} g_k, \tag{28}$$

where $c_1, c_2, \ldots$ is a sequence of points in $[a, b]$ and $g_1, g_2, \ldots$ is a
corresponding sequence of real numbers such that

$$\sum_{k=1}^{\infty} |g_k| < \infty.$$

In particular, the generating function of $\Phi(E)$ is given by

$$F(x) = \sum_{c_k \in [a, x]} g_k.$$

THEOREM 9. A countably additive set function $\Phi(E)$ defined on $\mathfrak{A} =
\mathfrak{A}[a, b]$ is discrete if and only if its generating function $F(x)$ is a jump
function in the following sense: $F(x)$ is continuous from the right in
$[a, b]$,¹⁰ and given any $\varepsilon > 0$, there exist finitely many points of
discontinuity $c_1, \ldots, c_n$ of $F(x)$ such that¹¹

$$\sum_{k=1}^{n} |F(c_k) - F(c_k - 0)| > V_a^b(F) - \varepsilon. \tag{29}$$

¹⁰ Recall that the generating function of a countably additive set function $\Phi(E)$ is
automatically continuous from the right, since $\Phi(E)$ is (upper) continuous, regarded as a
quasi-volume (cf. Sec. 5.6).

¹¹ In particular, $F(x)$ can have no more than countably many points of discontinuity.

Proof. First suppose $\Phi(E)$ is discrete, and let $c_1, c_2, \ldots$ and $g_1,
g_2, \ldots$ be the points and real numbers figuring in (28). Then

$$|F(c_k) - F(c_k - 0)| = |\Phi(\{c_k\})| = |g_k|,$$

$$V_a^b(F) = \sum_{k=1}^{\infty} |g_k|,$$

which implies (29). Conversely, suppose $F(x)$ satisfies (29). Then, by
Theorem 1,

$$\Phi(E) = C(E) + D(E),$$

where $C(E)$ is continuous and $D(E)$ is discrete. Correspondingly, $F(x)$
has the representation

$$F(x) = C(x) + D(x),$$

where $C(x)$ is continuous. By hypothesis, given any $\varepsilon > 0$, there are
points $c_1, \ldots, c_n \in [a, b]$ such that

$$\sum_{k=1}^{n} |F(c_k) - F(c_k - 0)| > V_a^b(F) - \varepsilon,$$

or equivalently,

$$\sum_{k=1}^{n} |D(c_k) - D(c_k - 0)| > V_a^b(F) - \varepsilon,$$

since $C(x)$ is continuous. Therefore the total variation of $D(x)$ cannot
be less than $V_a^b(F)$. But then, by Theorem 6, Corollary 2, the total
variation of $C(x)$ vanishes. It follows that $C(x) \equiv 0$, and hence

$$F(x) = D(x), \qquad \Phi(E) = D(E),$$

as required.


Finally, combining Theorem 3 and the last three theorems, we obtain 


THEOREM 10 (Lebesgue decomposition). If F(x) is the generating
function of a countably additive set function $\Phi(E)$ defined on $\mathfrak{A} = \mathfrak{A}[a, b]$,
then F(x) can be represented in the form

$$F(x) = A(x) + S(x) + D(x), \tag{30}$$

where A(x) is absolutely continuous, S(x) is continuous and singular, and
D(x) is a jump function. Moreover, A(x) itself has the representation

$$A(x) = \int_a^x g(\xi)\,d\xi,$$

where the function g(x) is summable on [a, b]. The representation (30) is
unique, and so is g(x), to within a set of measure zero.


COROLLARY 1. The conclusion of the theorem remains true if F(x)
is of bounded variation and continuous from the right in [a, b], except that
now

$$A(x) = F(a) + \int_a^x g(\xi)\,d\xi.$$

Proof. Consider the generating function F(x) − F(a).


COROLLARY 2. The conclusion of the theorem remains true if F(x)
is of bounded variation in [a, b], but not necessarily continuous from the
right, except that now

$$A(x) = F(a) + \int_a^x g(\xi)\,d\xi$$

and D(x) is a jump function in the sense that given any $\varepsilon > 0$, there exist
finitely many points of discontinuity $c_1, \ldots, c_n$ of D(x) such that¹²

$$\sum_{k=1}^{n} |D(c_k + 0) - D(c_k - 0)| > V_a^b(D) - \varepsilon.$$

12 Of course, in general, D(x) itself is no longer continuous from the right. Again, D(x) 
can have no more than countably many points of discontinuity. 




Proof. As we know from p. 85, F(x) is the difference between
two bounded nonnegative nondecreasing functions. Therefore, by an
elementary argument,¹³ F(x − 0) and F(x + 0) exist for all $x \in [a, b]$.
Moreover, there are at most countably many points where
$F(x - 0) \ne F(x + 0)$, since the inequality

$$|F(x + 0) - F(x - 0)| > \frac{1}{n}$$

can hold at no more than M(n) points, where M(n) is the largest integer in
$n V_a^b(F)$.

Hence there is a jump function $D_1(x)$ such that $F(x) + D_1(x)$ is
continuous from the right. The proof now follows by applying Corollary 1
to the function $F(x) + D_1(x)$.


PROBLEMS 


1. Let $\mathscr{A}$ be the space of all absolutely continuous functions defined on the
closed interval [a, b]. Show that $\mathscr{A}$ is a closed subspace of the space $\mathscr{V}$ of all
functions of bounded variation in [a, b] (cf. Prob. 8, p. 87).

2. Let $\mathscr{S}$ be the space of all singular functions defined on [a, b]. Show that
$\mathscr{S}$ is a closed subspace of $\mathscr{V}$.

3. Let $\mathscr{J}$ be the space of all jump functions defined on [a, b]. Show that $\mathscr{J}$ is
a closed subspace of $\mathscr{V}$.

4. Show that the Cantor function C(x) of Prob. 2, p. 86 is singular.

Hint. The corresponding set function C(E) vanishes in every interval
adjacent to the Cantor set (see Prob. 2, p. 21). Recall footnote 7, p. 199.


5. Show that if every term $\Phi_1(E), \Phi_2(E), \ldots$ of an everywhere convergent
series

$$\Phi(E) = \Phi_1(E) + \Phi_2(E) + \cdots$$

is a nonnegative countably additive set function defined on some σ-ring $\mathfrak{A}$,
then so is the sum $\Phi(E)$. Show that if every term is absolutely continuous
(relative to some measure $\mu$), then so is $\Phi(E)$.


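Hint. The relation

$$\sum_{m=1}^{n} \Phi(E_m) = \lim_{p \to \infty} \sum_{m=1}^{n} \sum_{k=1}^{p} \Phi_k(E_m) \le \lim_{p \to \infty} \sum_{k=1}^{p} \Phi_k(E) = \Phi(E)$$

implies

$$\sum_{m=1}^{\infty} \Phi(E_m) \le \Phi(E).$$

13 See e.g., T. M. Apostol, Mathematical Analysis, Addison-Wesley Publishing Co.,
Reading, Mass. (1957), p. 78.

On the other hand, given any $\varepsilon > 0$, we can choose p such that

$$\sum_{k=1}^{p} \Phi_k(E) > \Phi(E) - \varepsilon$$

and then n such that

$$\sum_{m=1}^{n} \Phi_k(E_m) > \Phi_k(E) - \frac{\varepsilon}{p} \qquad (k = 1, \ldots, p).$$

Therefore

$$\sum_{m=1}^{n} \Phi(E_m) \ge \sum_{k=1}^{p} \sum_{m=1}^{n} \Phi_k(E_m) > \sum_{k=1}^{p} \Phi_k(E) - \varepsilon > \Phi(E) - 2\varepsilon,$$

and hence

$$\sum_{m=1}^{\infty} \Phi(E_m) \ge \Phi(E).$$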


6. Construct a singular function F(x) on the interval [0, 1] with no intervals 
of constancy. 


Hint. Construct F(x) as a series of functions of the Cantor type (cf. Prob. 4)
such that F(x) is concentrated on a set Z dense in [0, 1]. Use one of the results
of Prob. 5.


7. Using the function F(x) of Prob. 6, show that a measurable function of a 
continuous function need not be measurable. 


Hint. The function F(x) maps a set of full measure into a set of measure
zero. Consider a nonmeasurable subset $W \subset E$ and let G(y) be the characteristic
function of the set F(W). Show that G[F(y)] is nonmeasurable.


8. Prove that a nondecreasing function $F(x) \not\equiv \text{const}$ is singular if and only if
it maps some set of measure zero into a set of full measure, or some set of
full measure into a set of measure zero.


9. Prove that the inverse of a continuous singular function with no intervals
of constancy is itself singular.


Hint. Use Prob. 8. 


10. Let $\Phi(E)$ be a countably additive set function defined on a σ-ring $\mathfrak{A}$ of
subsets of X, equipped with a Borel measure $\mu$, and suppose $\Phi(E)$ is not
absolutely continuous with respect to $\mu$. Let $\mathfrak{A}_\mu$ be a Lebesgue extension of $\mathfrak{A}$.
Show that in general $\Phi(E)$ cannot be extended onto the σ-ring $\mathfrak{A}_\mu$.


Hint. Let X be the interval [a, b], let $\mu$ be ordinary Lebesgue measure,
and let $\Phi(E)$ be a countably additive set function defined on all the Borel
subsets of X. If $\Phi(E)$ is continuous but not absolutely continuous, there
exists a noncountable set $E_0 \subset X$, $\mu(E_0) = 0$ such that $\Phi(E_0) \ne 0$. Construct
a $\Phi$-nonmeasurable set $E_1 \subset E_0$ which is Lebesgue measurable (with measure
zero).


10 


THE DERIVATIVE 
OF A SET FUNCTION 




10.1. Preliminaries. Various Definitions of the Derivative 


Let $\Phi(E)$ be a countably additive (finite) set function defined on a σ-ring
$\mathfrak{A}$ of subsets $E \subset X$, equipped with a nonnegative Borel measure $\mu(E)$.
As in Sec. 9.1, if $X \notin \mathfrak{A}$, we assume that $X = X_1 \cup X_2 \cup \cdots$, where
$X_1 \subset X_2 \subset \cdots$ and every $X_n \in \mathfrak{A}$. This situation will be summarized by
saying that "X is a set equipped with a measure $\mu$" and "$\Phi(E)$ is a countably
additive set function (defined) on X."

Suppose $\Phi(E)$ is absolutely continuous with respect to $\mu(E)$. Then,
according to the Radon-Nikodym theorem, $\Phi(E)$ can be represented in the
form

$$\Phi(E) = \int_E g(x)\,\mu(dx)$$

in terms of a $\mu$-summable function g(x), which we shall call the density
of $\Phi(E)$. This immediately raises the question of how to find the density of
$\Phi(E)$, starting from a knowledge of $\Phi(E)$ itself.

In simple cases, the procedure for finding g(x) is familiar from elementary
calculus. For example, let X be a finite interval [a, b], equipped with ordinary
Lebesgue measure, and let $\Phi(E)$ be the set function with generating function
F(x), so that

$$F(x) = \Phi[a, x] = \int_a^x g(\xi)\,d\xi.$$

If g(x) is continuous, then, as is well known, g(x) can be obtained by
differentiating F(x), i.e.,

$$g(x_0) = \lim_{h \to 0} \frac{F(x_0 + h) - F(x_0)}{h} = \lim_{h \to 0} \frac{\Phi(x_0, x_0 + h]}{h}, \tag{1}$$

where $\Phi(x_0, x_0 + h] = -\Phi(x_0 + h, x_0]$ if h < 0. Similarly, if $\mathbf{B}$ is an
n-dimensional block, again equipped with ordinary Lebesgue measure, and
if $g(x) = g(x_1, \ldots, x_n)$ is continuous, then

$$g(x_0) = \lim_{n \to \infty} \frac{1}{s(B_n)} \int_{B_n} g(\xi)\,d\xi,$$

where s(B) is the volume of the block $B \subset \mathbf{B}$, and the limit is taken with
respect to an arbitrary sequence of blocks $\{B_n\}$ converging to the point $x_0$.¹

Formula (1) gives the usual definition of differentiation with respect to 
one-dimensional Lebesgue measure, but it is hardly the unique definition. 
In fact, as we now show, (1) can be either weakened or strengthened, while 
remaining a perfectly plausible definition. 

Suppose first that instead of using intervals of the form $(x_0, x_0 + h]$, we
restrict ourselves to intervals from a much smaller class, for example, intervals
of the form

$$\left(\frac{p}{2^n}, \frac{p+1}{2^n}\right]$$

whose end points are binary numbers, where every interval contains the
point $x_0$ at which the derivative is to be evaluated. In other words, we
define the derivative of the countably additive set function $\Phi(E)$ by the
formula

$$F'(x_0) = \lim_{n \to \infty} \frac{F\left(\frac{p+1}{2^n}\right) - F\left(\frac{p}{2^n}\right)}{\frac{1}{2^n}} = \lim_{n \to \infty} \frac{\Phi\left(\frac{p}{2^n}, \frac{p+1}{2^n}\right]}{\frac{1}{2^n}}, \tag{2}$$
in 
provided the limit exists. This definition is weaker than (1), in the sense that
the existence of the limit (1) implies that of (2), but not conversely. In fact,
suppose the limit (1) exists at some point $x_0$. Then the expression

$$\frac{\Phi\left(\frac{p}{2^n}, \frac{p+1}{2^n}\right]}{\frac{1}{2^n}} = \frac{\Phi\left(\frac{p}{2^n}, x_0\right]}{x_0 - \frac{p}{2^n}} \cdot \frac{x_0 - \frac{p}{2^n}}{\frac{1}{2^n}} + \frac{\Phi\left(x_0, \frac{p+1}{2^n}\right]}{\frac{p+1}{2^n} - x_0} \cdot \frac{\frac{p+1}{2^n} - x_0}{\frac{1}{2^n}}$$

is a weighted average of two expressions approaching $F'(x_0)$ as $n \to \infty$, and
hence itself approaches $F'(x_0)$. On the other hand, as shown in Prob. 1,
p. 223, given an irrational point $x_0$, we can always construct a continuous
function F(x) of bounded variation such that

a) F(x) has no derivative in the ordinary sense at the point $x_0$;
b) F(x) vanishes at the end points of the intervals $p 2^{-n} < x \le (p + 1) 2^{-n}$
and hence has a (zero) derivative at $x_0$ in the sense of formula (2).

1 In the sense that the size of $B_n$ (see p. 7) approaches zero as $n \to \infty$, and

$$\bigcap_{n=1}^{\infty} B_n = \{x_0\}.$$
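The weighted-average argument showing that (1) implies (2) can be checked numerically. The sketch below is illustrative only: the test function sin x, the point 0.3 and the helper names are assumptions that do not come from the text. It computes the dyadic quotient of formula (2) and the weighted average of the two one-sided quotients, which agree up to rounding and both approach cos 0.3.

```python
import math

def dyadic_interval(x0, n):
    """End points a < x0 < b of the dyadic interval of rank n containing x0
    (x0 is assumed not to be a dyadic rational)."""
    p = math.floor(x0 * 2**n)
    return p / 2**n, (p + 1) / 2**n

def quotient(F, a, b):
    # Phi(a, b] / (b - a) for the set function generated by F.
    return (F(b) - F(a)) / (b - a)

def weighted_average(F, x0, n):
    a, b = dyadic_interval(x0, n)
    w_left = (x0 - a) / (b - a)     # weight of the quotient over (a, x0]
    w_right = (b - x0) / (b - a)    # weight of the quotient over (x0, b]
    return w_left * quotient(F, a, x0) + w_right * quotient(F, x0, b)

F, x0 = math.sin, 0.3               # assumed example data
for n in (5, 10, 15):
    a, b = dyadic_interval(x0, n)
    print(n, quotient(F, a, b), weighted_average(F, x0, n))
```

The two weights are nonnegative and sum to 1, which is exactly what makes the dyadic quotient an average of the one-sided quotients.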


Next, instead of using intervals of the form $(x_0, x_0 + h]$, we consider sets
from a much larger class. Thus we now define the derivative of the countably
additive set function $\Phi(E)$ by the formula

$$\lim_{n \to \infty} \frac{\Phi(E_n)}{\mu(E_n)} \tag{3}$$

(provided the limit exists), where $\mu$ is ordinary Lebesgue measure and
$E_1, E_2, \ldots$ is an arbitrary sequence of Borel sets "converging regularly"
to the point $x_0$ in the sense that

1) Every $E_n$ is contained in a half-open interval $\Delta_n = (a_n, b_n]$ such that
$x_0 \in \Delta_n$ and $a_n, b_n \to x_0$ as $n \to \infty$;
2) There is a fixed constant c > 0 such that

$$\mu(E_n) > c\,\mu(\Delta_n)$$

for every n.

This definition is stronger than (1), since the existence of the limit (3) implies
that of (1), but not conversely. In fact, suppose the limit (3) exists at some
point $x_0$, and choose $E_n = (x_0, x_0 + h_n]$, $\Delta_n = (x_0 - h_n, x_0 + h_n]$, where $h_n$
is an arbitrary sequence converging to zero as $n \to \infty$. Then $E_n$ converges
regularly to the point $x_0$ (with c = 1/4), so that the limit

$$\lim_{n \to \infty} \frac{\Phi(E_n)}{\mu(E_n)} = \lim_{n \to \infty} \frac{\Phi(x_0, x_0 + h_n]}{h_n},$$

equivalent to (1), exists (there are obvious changes for negative $h_n$). On
the other hand, the derivative of the function

$$F(x) = \begin{cases} x^2 \sin \dfrac{1}{x} & \text{if } x \ne 0, \\ 0 & \text{if } x = 0 \end{cases}$$

in the sense of formula (1) exists and equals zero at the point $x_0 = 0$, whereas
the derivative in the sense of formula (3) does not exist (see Prob. 2, p. 223).
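A numerical sketch may make the gap between (1) and (3) more concrete. It assumes that the function in question is $x^2 \sin(1/x)$ (the exponent is hard to read in the source), and the sets used are one possible choice of regularly converging sets, not necessarily the construction intended in Prob. 2. Each set E is a union of the intervals $(a_k, b_k]$ on which $\cos(1/x) < 0$, contained in $\Delta = (-b_K, b_K]$; the ratio $\mu(E)/\mu(\Delta)$ stays above 1/5, while $\Phi(E)/\mu(E)$ stays near $2/\pi$ instead of approaching 0.

```python
import math

def F(x):
    # Assumed form of the example function: x^2 sin(1/x), with F(0) = 0.
    return x * x * math.sin(1.0 / x) if x != 0.0 else 0.0

def ordinary_quotient(h):
    # Phi(0, h] / h, the quotient of formula (1) at x0 = 0.
    return (F(h) - F(0.0)) / h

def regular_quotient(K, terms=20000):
    """Return (Phi(E)/mu(E), mu(E)/mu(Delta)) for E the union of the
    intervals (a_k, b_k], K <= k < K + terms, and Delta = (-b_K, b_K]."""
    phi = mu = 0.0
    for k in range(K, K + terms):
        b = 1.0 / (2 * math.pi * k + math.pi / 2)      # sin(1/b) = +1
        a = 1.0 / (2 * math.pi * k + 3 * math.pi / 2)  # sin(1/a) = -1
        phi += F(b) - F(a)                             # Phi(a, b]
        mu += b - a
    b_K = 1.0 / (2 * math.pi * K + math.pi / 2)
    return phi / mu, mu / (2 * b_K)

print(ordinary_quotient(1e-4))
print(regular_quotient(50))
print(2 / math.pi)
```

Increasing K shrinks $\Delta$ toward the point 0 without changing either ratio appreciably, which is the behavior that rules out a derivative in the sense of (3).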
The above considerations notwithstanding, it turns out that differences 
between the definitions (1), (2) and (3) appear only “at separate points.” 
In fact, as we shall soon see, the three definitions (1), (2) and (3) of the 
derivative of a function of bounded variation (or their analogues for a countably 




additive set function defined on a general set X) exist and coincide on a set 
of full measure. 


10.2. Differentiation with Respect to a Net 


The appropriate generalization of differentiation in the sense of formula
(2) is differentiation with respect to a net. Let X be a set equipped with a
measure $\mu$, and suppose X is the union of a countable family $\mathfrak{N}_1$ of disjoint
Borel sets $A_1^{(1)}, \ldots, A_j^{(1)}, \ldots$, called sets of the first rank. Suppose further
that every set $A_j^{(1)}$ of the first rank is itself the union of a countable family
of disjoint Borel sets $A_1^{(2)}, \ldots, A_k^{(2)}, \ldots$, called sets of the second rank, and
let $\mathfrak{N}_2$ denote the family of all such sets where j is arbitrary. Imagine this
process repeated for every integer n, leading in each case to a family $\mathfrak{N}_n$ of
disjoint Borel sets, said to be of the nth rank, whose union is the original
set X. Then the family

$$\mathfrak{N} = \bigcup_{n=1}^{\infty} \mathfrak{N}_n$$

of all sets of every finite rank is called a net, provided $\mathfrak{N}$ is completely
sufficient in the sense of Sec. 7.4 (note that $\mathfrak{N}$ is a semiring).

Now let $\Phi(E)$ be a countably additive set function defined on X (and
hence on $\mathfrak{N}$). Then, by the derivative of $\Phi(E)$ at the point $x_0$ with respect to
the net $\mathfrak{N}$, we mean the quantity

$$D_{\mathfrak{N}}\Phi(x_0) = \lim_{n \to \infty} \frac{\Phi[A_n(x_0)]}{\mu[A_n(x_0)]} \tag{4}$$

(provided the limit exists), where $A_n(x_0)$ is the unique set of the nth rank
containing $x_0$. According to a theorem of de Possel,² the derivative of $\Phi(E)$
with respect to any net $\mathfrak{N}$ exists on a set of full measure (for suitable X) and
coincides with the density of the absolutely continuous component of $\Phi(E)$. In
particular, it follows that the derivative is independent of $\mathfrak{N}$.


Example. Let X be the unit interval [0, 1], equipped with ordinary
Lebesgue measure, and let the sets of the nth rank be

$$\left[0, \frac{1}{2^n}\right], \left(\frac{1}{2^n}, \frac{2}{2^n}\right], \ldots, \left(\frac{2^n - 1}{2^n}, 1\right].$$

Then the derivative with respect to the corresponding net $\mathfrak{N}$ is the derivative
in the sense of formula (2).
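The independence of the derivative from the choice of net can be illustrated on a smooth example. The sketch below is an assumption-laden illustration, not part of the text: it takes the generating function $F(x) = x^3$ (density $3x^2$), the point $x_0 = 1/3$, and the nets of binary, ternary and decimal intervals, with intervals taken closed on the left for convenience (immaterial here, since F is continuous). Exact rational arithmetic avoids rounding in the end points.

```python
from fractions import Fraction
from math import floor

def net_quotient(F, x0, base, rank):
    """Phi[A_n(x0)] / mu[A_n(x0)] for the net of base-adic intervals of the
    given rank, where Phi of an interval with end points a < b is F(b) - F(a)."""
    width = Fraction(1, base**rank)
    a = floor(x0 / width) * width   # left end point of the set containing x0
    b = a + width
    return (F(b) - F(a)) / width

F = lambda x: x**3                  # assumed generating function, density 3x^2
x0 = Fraction(1, 3)
for base, rank in ((2, 20), (3, 13), (10, 6)):
    print(base, float(net_quotient(F, x0, base, rank)))
```

All three quotients are close to $3x_0^2 = 1/3$, in agreement with de Possel's theorem for this particular F.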


2 De Possel's theorem will be deduced in Sec. 10.4 from the more general Lebesgue-
Vitali theorem, which treats the analogue of "ordinary differentiation," i.e., differentiation
in the sense of formula (1).




10.3. Differentiation with Respect to a Vitali System. 
The Lebesgue-Vitali Theorem 


Next we consider the generalization of differentiation in the sense of
formula (1). Again let X be a set equipped with a measure $\mu$, but now suppose
every set {x} consisting of a single point $x \in X$ is measurable, with measure
zero. By a Vitali system we mean a family $\mathfrak{V}$ of Borel sets $E \subset X$ (called
Vitali sets) which has the following properties:

1) Given any Borel set (or generalized Borel set) E and any $\varepsilon > 0$, there
are countably many Vitali sets $A_1, A_2, \ldots$ such that

$$\bigcup_{n=1}^{\infty} A_n \supset E, \qquad \mu\left(\bigcup_{n=1}^{\infty} A_n\right) < \mu(E) + \varepsilon.$$

2) Every set $E \in \mathfrak{V}$ has a boundary, i.e., a set $\Gamma(E)$ of measure zero such
that

a) If $x \in E - E\Gamma(E)$, then every Vitali set of sufficiently small measure
containing x is contained in $E - E\Gamma(E)$;

b) If $x \notin \overline{E} = E \cup \Gamma(E)$, then every Vitali set of sufficiently small
measure containing x does not intersect E.

3) Suppose $E \subset X$ is a set covered by a subsystem $\mathfrak{B} \subset \mathfrak{V}$ of Vitali sets
such that for any $x \in E$ and any $\varepsilon > 0$, there is a set $A_\varepsilon(x) \in \mathfrak{B}$ of
measure less than $\varepsilon$ which contains x. Then E can be covered to within
a set of measure zero by countably many disjoint sets $A_j \in \mathfrak{B}$.


Now let $\Phi(E)$ be a countably additive set function defined on X (and hence
on $\mathfrak{V}$). Then, by the derivative of $\Phi(E)$ at the point $x_0$ with respect to the
Vitali system $\mathfrak{V}$, we mean the quantity

$$D\Phi(x_0) = \lim_{\varepsilon \to 0} \frac{\Phi[A_\varepsilon(x_0)]}{\mu[A_\varepsilon(x_0)]}$$

(provided the limit exists), where $A_\varepsilon(x_0)$ is any Vitali set of measure less
than $\varepsilon$ containing the point $x_0$. In any event, the quantities

$$\overline{D}\Phi(x_0) = \varlimsup_{\varepsilon \to 0} \frac{\Phi[A_\varepsilon(x_0)]}{\mu[A_\varepsilon(x_0)]}$$

and

$$\underline{D}\Phi(x_0) = \varliminf_{\varepsilon \to 0} \frac{\Phi[A_\varepsilon(x_0)]}{\mu[A_\varepsilon(x_0)]}$$

always exist (provided we allow the values $\pm\infty$), where $\overline{D}\Phi(x_0)$ is called the
upper derivative and $\underline{D}\Phi(x_0)$ the lower derivative of $\Phi(E)$ at the point $x_0$ with
respect to the Vitali system $\mathfrak{V}$. A necessary and sufficient condition for
$\Phi(E)$ to have a derivative at $x_0$ with respect to $\mathfrak{V}$ is obviously that

$$\overline{D}\Phi(x_0) = \underline{D}\Phi(x_0) \ne \pm\infty.$$

In general, the quantities $D\Phi(x_0)$, $\overline{D}\Phi(x_0)$ and $\underline{D}\Phi(x_0)$ depend not only on the
function $\Phi(E)$ but also on the underlying Vitali system $\mathfrak{V}$. However, as we
shall see presently, effects of this sort manifest themselves only on a set of
measure zero. First we establish some necessary preliminary results:


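LEMMA 1. If the inequality

$$\overline{D}\Phi(x) = \varlimsup_{\varepsilon \to 0} \frac{\Phi[A_\varepsilon(x)]}{\mu[A_\varepsilon(x)]} > c$$

(c fixed) holds at every point x of a set $E \subset X$ of positive measure (where X
is a Borel set³), then, given any $\varepsilon > 0$, there is a set $Q \subset E$ such that

$$\mu(E - Q) < \varepsilon, \qquad \Phi(Q) \ge c\mu(Q).$$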


Proof. Given any $\varepsilon > 0$, we use Property 1 to cover the complement
$\mathscr{C}E$ of the set E by a countable family of Vitali sets $B_1, B_2, \ldots$ such that⁴

$$\mu\left(\bigcup_{n=1}^{\infty} B_n\right) < \mu(\mathscr{C}E) + \varepsilon.$$

Letting $\overline{B}$ denote the union of all the sets $\overline{B}_n = B_n \cup \Gamma(B_n)$, we now use
mathematical induction to construct a sequence of sets $Q_1, Q_2, \ldots$,
each a union of no more than countably many disjoint Vitali sets, such
that $\Phi(Q_n) > c\mu(Q_n)$ and

$$E - \overline{B}E - Z_n \subset Q_n \subset \mathscr{C}(B_1 \cup \cdots \cup B_n),$$

where each $Z_n$ (n = 1, 2, ...) is a set of measure zero. First setting n = 1,
for every point $x \in E - \overline{B}E$ we find all the Vitali sets $A_\alpha^{(1)}(x)$ which
contain x,⁵ do not intersect $B_1$ and satisfy the inequality

$$\frac{\Phi[A_\alpha^{(1)}(x)]}{\mu[A_\alpha^{(1)}(x)]} > c.$$

By Property 2b such sets exist and have arbitrarily small measure. Then
from this covering of $E - \overline{B}E$, we use Property 3 to select a sequence
of disjoint Vitali sets $A_j^{(1)}$ covering $E - \overline{B}E$ to within a set $Z_1$ of measure
zero. Thus, if $Q_1$ is the union of all the $A_j^{(1)}$, we have

$$E - \overline{B}E - Z_1 \subset Q_1 \subset \mathscr{C}B_1,$$


3 This restriction is removed in the remark on p. 212.

4 Here we tacitly assume that $E \ne X$. If E = X, we apply the same argument to a proper
subset $E^{(1)} \subset E$ and its complement $E^{(2)}$, eventually obtaining

$$\mu(E^{(1)} - Q^{(1)}) + \mu(E^{(2)} - Q^{(2)}) = \mu(E - Q) < 2\varepsilon,$$

where $Q = Q^{(1)} \cup Q^{(2)}$ and $\Phi(Q) \ge c\mu(Q)$ [note that $Q^{(1)}$ and $Q^{(2)}$, like $E^{(1)}$ and $E^{(2)}$,
are disjoint].

5 Here the dummy index $\alpha$ ranges over a set which is in general uncountable.




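and moreover,

$$\Phi(Q_1) = \sum_{j=1}^{\infty} \Phi(A_j^{(1)}) > c \sum_{j=1}^{\infty} \mu(A_j^{(1)}) = c\mu(Q_1).$$

Next, suppose we have already managed to construct the sets

$$Q_1 = \bigcup_{j=1}^{\infty} A_j^{(1)}, \quad \ldots, \quad Q_{n-1} = \bigcup_{j=1}^{\infty} A_j^{(n-1)},$$

and the corresponding sets $Z_1, \ldots, Z_{n-1}$ of measure zero. Then the sets
$Q_n$ and $Z_n$ can be found as follows: Let

$$Y_n = \bigcup_{j=1}^{\infty} \Gamma(A_j^{(n-1)}),$$

where $\mu(Y_n) = 0$ since every $\Gamma(A_j^{(n-1)})$ is of measure zero. For every
point $x \in E - \overline{B}E - Z_{n-1} - Y_n$, we find all the Vitali sets $A_\alpha^{(n)}(x)$ which
contain x, are contained in $Q_{n-1}$, do not intersect $B_1 \cup \cdots \cup B_n$ and
satisfy the inequality

$$\frac{\Phi[A_\alpha^{(n)}(x)]}{\mu[A_\alpha^{(n)}(x)]} > c.$$

By Properties 2a and 2b, such sets exist and have arbitrarily small
measure. Using Property 3, we construct a countable union $Q_n$ of disjoint
Vitali sets $A_j^{(n)}$ covering $E - \overline{B}E - Z_{n-1} - Y_n$ to within a set $Y_n'$ of
measure zero. Then

$$E - \overline{B}E - Z_{n-1} - Y_n - Y_n' \subset Q_n \subset \mathscr{C}(B_1 \cup \cdots \cup B_n),$$

which can also be written as

$$E - \overline{B}E - Z_n \subset Q_n \subset \mathscr{C}(B_1 \cup \cdots \cup B_n),$$

where $Z_n = Z_{n-1} \cup Y_n \cup Y_n'$ is again a set of measure zero. Moreover,

$$\Phi(Q_n) = \sum_{j=1}^{\infty} \Phi(A_j^{(n)}) > c \sum_{j=1}^{\infty} \mu(A_j^{(n)}) = c\mu(Q_n),$$

and we have succeeded in constructing the desired sequence
$Q_1 \supset Q_2 \supset \cdots$.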
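Finally let

$$Q = \bigcap_{n=1}^{\infty} Q_n.$$

Then

$$\Phi(Q) = \lim_{n \to \infty} \Phi(Q_n) \ge c \lim_{n \to \infty} \mu(Q_n) = c\mu(Q),$$

while on the other hand,

$$E - \overline{B}E - \bigcup_{n=1}^{\infty} Z_n \subset Q \subset \mathscr{C}\left(\bigcup_{n=1}^{\infty} B_n\right) \subset E,$$

where

$$\mu(E - Q) \le \mu(\overline{B}E) + \sum_{n=1}^{\infty} \mu(Z_n) = \mu(\overline{B}) - \mu(\mathscr{C}E) < \varepsilon.$$

This completes the proof.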


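Remark. The assumption that X is a Borel set has been used (tacitly)
to deduce that $\mathscr{C}E \in \mathfrak{A}$ (where $\mathfrak{A}$ is the underlying σ-ring) and to write
$\mu(\mathscr{C}E)$. However, this assumption can be dropped. In fact, if $X \notin \mathfrak{A}$, then
$X = X_1 \cup X_2 \cup \cdots$, where $X_1 \subset X_2 \subset \cdots$ and every $X_n \in \mathfrak{A}$. Given any
$E \in \mathfrak{A}$, let $E^{(n)} = EX_n$, so that

$$E = \bigcup_{n=1}^{\infty} E^{(n)}.$$

Then, given any $\varepsilon > 0$, we use Lemma 1 to find a set $Q^{(n)} \subset E^{(n)}$ such that

$$\mu(E^{(n)} - Q^{(n)}) < \varepsilon, \qquad \Phi(Q^{(n)}) \ge c\mu(Q^{(n)})$$

for every n = 1, 2, ..., and taking the limit as $n \to \infty$, we obtain

$$\mu(E - Q) < \varepsilon, \qquad \Phi(Q) \ge c\mu(Q), \qquad Q = \bigcup_{n=1}^{\infty} Q^{(n)},$$

as required.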
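LEMMA 1′. If the inequality

$$\underline{D}\Phi(x) = \varliminf_{\varepsilon \to 0} \frac{\Phi[A_\varepsilon(x)]}{\mu[A_\varepsilon(x)]} < c$$

(c fixed) holds at every point x of a set $E \subset X$ of positive measure, then,
given any $\varepsilon > 0$, there is a set $Q \subset E$ such that

$$\mu(E - Q) < \varepsilon, \qquad \Phi(Q) \le c\mu(Q).$$

Proof. Replace $\Phi(E)$ by $2c\mu(E) - \Phi(E)$ in Lemma 1, and use the
remark.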


LEMMA 2. The set

$$E_c = \{x : \overline{D}\Phi(x) \ge c\}$$

is measurable for arbitrary real c.

Proof. Given any $\varepsilon > 0$, we can cover every point $x \in E_c$ by a Vitali
set $A_\varepsilon(x)$ of measure less than $\varepsilon$ such that

$$\frac{\Phi[A_\varepsilon(x)]}{\mu[A_\varepsilon(x)]} > c - \varepsilon.$$

Then the set

$$Q_\varepsilon = \bigcup_{x \in E_c} A_\varepsilon(x)$$

is measurable. To see this, let $\mathfrak{B}$ be the subsystem of all Vitali sets with
the property that every set of $\mathfrak{B}$ is contained in at least one of the sets
$A_\varepsilon(x)$.⁶ Then $\mathfrak{B}$ covers $Q_\varepsilon$, and in fact $Q_\varepsilon$ coincides with the union of
all the sets of $\mathfrak{B}$. It follows from Property 3 that

$$Q_\varepsilon = \bigcup_{j=1}^{\infty} A_j, \qquad A_j \in \mathfrak{B},$$

to within a set of measure zero, and hence $Q_\varepsilon$ is measurable, as asserted.

Now let $\{\varepsilon_n\}$ be a sequence of positive numbers approaching zero as
$n \to \infty$. Then the set

$$Q = \bigcap_{n=1}^{\infty} Q_{\varepsilon_n}$$

is measurable. Obviously

$$Q \supset E_c, \tag{5}$$

and moreover

$$Q \subset E_c. \tag{6}$$

In fact, given any point $x_0 \in Q$ and any n, there is at least one set
$A_{\varepsilon_n}(x_0) \subset Q_{\varepsilon_n}$ of measure less than $\varepsilon_n$ containing $x_0$, since $x_0$ belongs
to every $Q_{\varepsilon_n}$. But then

$$\overline{D}\Phi(x_0) \ge \varlimsup_{n \to \infty} \frac{\Phi[A_{\varepsilon_n}(x_0)]}{\mu[A_{\varepsilon_n}(x_0)]} \ge \lim_{n \to \infty} (c - \varepsilon_n) = c,$$

which proves (6). Comparing (5) and (6), we find that $E_c = Q$. Therefore
$E_c$ is measurable, and the lemma is proved.


We are now in a position to prove the basic theorem on differentiation 
with respect to a Vitali system: 


THEOREM 1 (Lebesgue-Vitali theorem).⁷ Let X be a set equipped with
a measure $\mu$, let $\mathfrak{V}$ be a Vitali system of Borel subsets of X, and let $\Phi(E)$
be a countably additive set function on X. Then the derivative of $\Phi(E)$
with respect to $\mathfrak{V}$ exists on a set of full measure, and coincides with the
derivative of the absolutely continuous component of $\Phi(E)$.

Proof. Let

$$\Phi(E) = S(E) + A(E), \qquad A(E) = \int_E g(x)\,\mu(dx), \tag{7}$$


6 In particular, $\mathfrak{B}$ contains every set $A_\varepsilon(x)$.

7 The first abstract formulation of the Lebesgue-Vitali theorem (for an absolutely
continuous set function) is due to Y. N. Yunovich, Sur la dérivation des fonctions absolument
additives d'ensemble, Dokl. Akad. Nauk SSSR, 30, 112 (1941).




where S(E) is singular (but not necessarily continuous), and A(E) is
absolutely continuous. We shall prove the theorem in two steps:

Step 1. DS(x) exists almost everywhere and equals zero. Let Z be the
set of measure zero on which S(E) is concentrated. Since the positive and
negative variations of S(E) are both singular and concentrated on Z,
there is no loss of generality in assuming that $S(E) \ge 0$ and hence

$$\underline{D}S(x) \ge 0. \tag{8}$$

According to Lemma 2, the set

$$E_c = \{x : \overline{D}S(x) \ge c\} \cap \mathscr{C}Z$$

is measurable.⁸ Moreover, $E_c$ has measure zero for every c > 0, since otherwise,
according to Lemma 1, there is a set $Q \subset E_c$ of positive measure such that

$$S(Q) \ge \frac{c}{2}\,\mu(Q) > 0.$$

But this is impossible, since Q does not intersect the set Z on which
S(E) is concentrated. Therefore $\mu(E_c) = 0$, and hence

$$\mu(\{x : \overline{D}S(x) > 0\} \cap \mathscr{C}Z) = \lim_{c \to 0} \mu(E_c) = 0.$$

It follows that $\overline{D}S(x) \le 0$ almost everywhere, and hence, by (8), DS(x) = 0
almost everywhere.

Step 2. DA(x) exists almost everywhere and equals g(x). If $\overline{D}A(x) > c$
for all x in a Borel set $X_0 \subset X$, then

$$A(X_0) \ge c\mu(X_0). \tag{9}$$

In fact, since the integral in (7) is absolutely continuous (see Property d,
p. 124), given any $\varepsilon > 0$, there is a $\delta > 0$ such that $\mu(E) < \delta$ implies
$|A(E)| < \varepsilon$. But, according to Lemma 1, we can find a set $Q \subset X_0$
such that

$$\mu(X_0 - Q) < \delta, \qquad A(Q) \ge c\mu(Q).$$

It follows that

$$A(X_0) = A(Q) + A(X_0 - Q) \ge c\mu(Q) - \varepsilon > c\mu(X_0) - |c|\delta - \varepsilon,$$

which proves (9), since $\delta$ and $\varepsilon$ are arbitrarily small. In the same way,
if $\underline{D}A(x) < c$ for all $x \in X_0$, we can use Lemma 1′ to prove that

$$A(X_0) \le c\mu(X_0).$$


8 In writing $\mathscr{C}Z$, we assume that Z is a Borel set, a restriction that can be dropped by
using an argument like that given in the remark on p. 212.

Next let

$$E_{ab} = \{x : a < g(x) < b\},$$

where g(x) is the $\mu$-summable function figuring in (7). Then the
inequalities

$$\overline{D}A(x) \le b, \qquad \underline{D}A(x) \ge a$$

hold almost everywhere on $E_{ab}$. In fact, suppose $\overline{D}A(x) > b$ on a set
$E \subset E_{ab}$ of positive measure. Then, according to (9),

$$A(E) \ge b\mu(E),$$

which is impossible, since

$$A(E) = \int_E g(x)\,\mu(dx) < b\mu(E).$$

Similarly, $\underline{D}A(x) < a$ cannot hold on a set $E \subset E_{ab}$ of positive measure.


Finally, consider the family of all sets of the form

$$E_{r_n s_n} = \{x : r_n < g(x) < s_n\},$$

where $r_n$ and $s_n$ ($r_n < s_n$) are arbitrary rational numbers. As just shown,

$$r_n \le \underline{D}A(x) \le \overline{D}A(x) \le s_n \tag{10}$$

on $E_{r_n s_n}$, except on a set $Z_{r_n s_n}$ of measure zero. It follows that DA(x)
exists and equals g(x) everywhere on the set of full measure

$$X' = X - \Bigl(\bigcup_{(r_n, s_n)} Z_{r_n s_n}\Bigr) - Z',$$

where Z′ is the set (possibly empty) of measure zero where g(x) takes
infinite values. In fact, if $x \in X'$, then (10) holds for every pair of rational
numbers $r_n$ and $s_n$ such that

$$r_n < g(x) < s_n.$$

Choosing $r_n$ and $s_n$ arbitrarily close together, we find that

$$\underline{D}A(x) = \overline{D}A(x) = DA(x) = g(x),$$

and the theorem is proved.


10.4. Some Consequences of the Lebesgue-Vitali Theorem 


We now examine some of the implications of Theorem 1, first for the
case of nets and then for a family of cubes.


10.4.1. De Possel's theorem. We begin by proving


THEOREM 2. Let X be a set equipped with a measure $\mu$, such that
every set {x} consisting of a single point $x \in X$ is measurable, with measure
zero. Then every net $\mathfrak{N}$ of subsets of X is a Vitali system.




Proof. By definition, $\mathfrak{N}$ is completely sufficient, which verifies
Property 1, p. 209. To verify Property 2a, we choose the empty set $\varnothing$
as the boundary $\Gamma(E)$ of every set $E \in \mathfrak{N}$. Then, given any point $x \in E$,
$E \in \mathfrak{N}$, let E′ be any other set in $\mathfrak{N}$ containing x. Since two sets E,
$E' \in \mathfrak{N}$ are either disjoint (in particular, this is the case if E and E′ are
of the same rank) or else one of the sets is a proper subset of the other,
it follows that every set in $\mathfrak{N}$ of sufficiently small measure containing x
is contained in E. As for Property 2b, we need only note that if x does
not belong to a given set $E \in \mathfrak{N}$, then x must belong to some other set
$E' \in \mathfrak{N}$, $EE' = \varnothing$. Therefore every set $A \in \mathfrak{N}$ of sufficiently small measure
containing x is contained in E′, i.e., cannot intersect E. Finally, to prove
Property 3, suppose $E \subset X$ is a set covered by a subfamily $\mathfrak{B} \subset \mathfrak{N}$.
Then we can always eliminate intersecting sets, by the simple device of
first choosing all sets of the first rank in $\mathfrak{B}$, then all the sets of the second
rank in $\mathfrak{B}$ not contained in those already chosen, and so on. This gives
a sequence of disjoint sets $A_j$ (j = 1, 2, ...) covering E, as required.⁹

COROLLARY 1 (De Possel's theorem). Let X and $\mathfrak{N}$ be the same as
in Theorem 2, and let $\Phi(E)$ be a countably additive set function on X.
Then the derivative of $\Phi(E)$ with respect to $\mathfrak{N}$ exists on a set of full measure,
and coincides with the density of the absolutely continuous component
of $\Phi(E)$.

Proof. Apply the Lebesgue-Vitali theorem.


COROLLARY 2. If F(x) is a function of bounded variation in [a, b],
then the derivative of F(x) in the sense of formula (2) exists almost
everywhere and equals the density of the absolutely continuous component
of F(x).


10.4.2. Lebesgue’s theorem on differentiation of a function of bounded 
variation. Next we consider Vitali systems of a particularly important kind: 


THEOREM 3. Let X be a bounded n-dimensional basic block $\mathbf{B}$,
equipped with ordinary Lebesgue measure. Then the family $\mathfrak{Q}$ of all closed
cubes of the form

$$Q = \{x : a_1 \le x_1 \le a_1 + h, \ldots, a_n \le x_n \le a_n + h\} \subset \mathbf{B}$$

(h > 0) is a Vitali system.


Proof. It is clear that Properties 1, 2a and 2b, p. 209 are satisfied
if the boundary $\Gamma(Q)$ of every cube $Q \in \mathfrak{Q}$ is defined to be its ordinary
topological boundary (i.e., the intersection of the closure of Q with the
closure of its complement).¹⁰ To verify Property 3, we use the following


9 Note that any net $\mathfrak{N}$ (and hence any subset of $\mathfrak{N}$) is countable.


10 Then $Q - Q\Gamma(Q)$ is the interior of Q and $\overline{Q} = Q \cup \Gamma(Q)$ the closure of Q (as
anticipated by the notation).




proof (due to Banach): Let $E \subset \mathbf{B}$ be a set covered by a subsystem
$\mathfrak{B} \subset \mathfrak{Q}$ of cubes such that for any $x \in E$ and any $\varepsilon > 0$, there is a cube
$A_\varepsilon(x) \in \mathfrak{B}$ with volume $s[A_\varepsilon(x)]$ less than $\varepsilon$ which contains x. Writing

$$k_1 = \sup_{A_\alpha \in \mathfrak{B}_1} s(A_\alpha),$$

where $\mathfrak{B}_1 = \mathfrak{B}$, we choose any cube $Q_1$ in $\mathfrak{B}_1$ with volume greater than
$k_1/2$. If $Q_1$ does not cover E to within a set of measure zero, then the
set $\mathfrak{B}_2 \subset \mathfrak{B}_1$ of cubes disjoint from $Q_1$ is nonempty. In this case, we write

$$k_2 = \sup_{A_\alpha \in \mathfrak{B}_2} s(A_\alpha),$$

and choose any cube $Q_2 \in \mathfrak{B}_2$ with volume greater than $k_2/2$. Continuing
this process, we either eventually manage to cover E (to within a set
of measure zero) by finitely many cubes in $\mathfrak{B}$, thereby establishing
Property 3 at once, or else we obtain sets $\mathfrak{B} = \mathfrak{B}_1 \supset \mathfrak{B}_2 \supset \cdots$, numbers
$k_1 \ge k_2 \ge \cdots$ and disjoint cubes $Q_1, Q_2, \ldots$ such that

$$Q_j \in \mathfrak{B}_j, \qquad s(Q_j) > \frac{k_j}{2} \qquad (j = 1, 2, \ldots).$$

As we now show, the sequence $\{Q_j\}$ covers E to within a set of measure
zero, so that Property 3 holds in any event.

In fact, let Z be the subset of E such that $ZQ_j = \varnothing$ (j = 1, 2, ...).
Then, given any integer p > 1 and any point $x_0 \in Z$, we have

$$x_0 \notin Q_j \qquad (j = 1, \ldots, p - 1).$$

On the other hand, since $x_0 \in E$, there are cubes in $\mathfrak{B}$ of arbitrarily small
volume containing $x_0$. In particular, there is a cube $Q \in \mathfrak{B}$ containing
$x_0$ such that $QQ_1 = \cdots = QQ_{p-1} = \varnothing$, and hence $Q \in \mathfrak{B}_p$. If moreover
$QQ_p = \cdots = QQ_{r-1} = \varnothing$, then Q belongs to $\mathfrak{B}_{p+1}, \ldots, \mathfrak{B}_r$, and hence
$s(Q) \le k_r$. But the series

$$\sum_j k_j < 2 \sum_j s(Q_j) \le 2 s(\mathbf{B})$$

converges, so that $k_j \to 0$ as $j \to \infty$. Thus there is a first index $r \ge p$
such that Q intersects $Q_r$. Let l denote the side length of Q and $l_r$ that
of $Q_r$. Since $Q \in \mathfrak{B}_r$, we have $s(Q) \le k_r$, i.e., $l \le \sqrt[n]{k_r}$. Moreover
$s(Q_r) > k_r/2$, by construction, and hence $l_r > \sqrt[n]{k_r/2}$. Since Q intersects
$Q_r$, Q is contained in the cube with the same center as $Q_r$ but with side
length

$$l_r + 2l \le l_r + 2\sqrt[n]{k_r} < l_r + 2\sqrt[n]{2}\,l_r = (1 + 2\sqrt[n]{2})\,l_r \le 5 l_r,$$




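and hence certainly in the cube $\hat{Q}_r$ with the same center as $Q_r$ but with
side length five times that of $Q_r$. It follows that

$$x_0 \in \bigcup_{j=p}^{\infty} \hat{Q}_j,$$

and hence

$$Z \subset \bigcup_{j=p}^{\infty} \hat{Q}_j$$

for every p > 1, since $x_0$ is an arbitrary point of Z. But $s(\hat{Q}_j) = 5^n s(Q_j)$
and the series $\sum_j s(Q_j)$ converges, so that

$$\sum_{j=p}^{\infty} s(\hat{Q}_j) = 5^n \sum_{j=p}^{\infty} s(Q_j) \to 0$$

as $p \to \infty$, and hence Z is a set of measure zero. This completes the
proof.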
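The selection step in Banach's argument can be tried out on a finite family of intervals (the case n = 1). The following is only a sketch under simplifying assumptions: the family is finite and randomly generated, so the supremum $k_j$ is attained and the largest remaining interval is picked, which certainly has length greater than $k_j/2$. The function names are hypothetical. The check at the end confirms the key geometric fact of the proof, namely that every interval of the family lies inside the fivefold enlargement of some selected interval.

```python
import random

def greedy_disjoint(intervals):
    """Banach-style selection for a finite family of closed intervals (a, b)."""
    remaining = list(intervals)
    chosen = []
    while remaining:
        # The longest remaining interval has length > k_j / 2.
        q = max(remaining, key=lambda iv: iv[1] - iv[0])
        chosen.append(q)
        # Keep only the intervals disjoint from q (this also removes q itself).
        remaining = [iv for iv in remaining if iv[1] < q[0] or iv[0] > q[1]]
    return chosen

def enlarge(iv, factor=5.0):
    """Interval with the same center as iv and `factor` times its length."""
    center, half = (iv[0] + iv[1]) / 2, (iv[1] - iv[0]) / 2
    return center - factor * half, center + factor * half

random.seed(0)
family = []
for _ in range(200):
    left = random.random()
    family.append((left, left + random.uniform(0.001, 0.1)))

chosen = greedy_disjoint(family)
covered = all(
    any(enlarge(q)[0] <= a and b <= enlarge(q)[1] for q in chosen)
    for a, b in family
)
print(len(chosen), covered)
```

In the infinite case treated in the text the enlargements only cover the leftover set Z from some index p onward, and it is the convergence of $\sum_j s(Q_j)$ that forces Z to have measure zero.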


COROLLARY 1. Let $\Phi(E)$ be a countably additive set function on a
bounded n-dimensional block $\mathbf{B}$, and let $x_0 = (x_1^{(0)}, \ldots, x_n^{(0)})$ be a point
of $\mathbf{B}$. Then the derivative

$$D\Phi(x_0) = \lim_{h \to 0} \frac{\Phi[Q_h(x_0)]}{s[Q_h(x_0)]},$$

where $Q_h(x_0)$ is the cube¹¹

$$\{x : x_1^{(0)} \le x_1 \le x_1^{(0)} + h, \ldots, x_n^{(0)} \le x_n \le x_n^{(0)} + h\},$$

exists almost everywhere and coincides with the density of the absolutely
continuous component of $\Phi(E)$.


COROLLARY 2 (Lebesgue's theorem on differentiation of a function of
bounded variation). If F(x) is of bounded variation in [a, b], then the
"ordinary" derivative

$$F'(x) = \lim_{h \to 0} \frac{F(x + h) - F(x)}{h}$$

exists almost everywhere and equals the density of the absolutely
continuous component of F(x).


Proof. If $\Phi(E)$ is the set function with generating function F(x),
then the difference between the quantity $\Phi[x, x + h] = \Phi(x, x + h] + \Phi(\{x\})$
figuring in Corollary 1 and the quantity $\Phi(x, x + h] = F(x + h) - F(x)$
does not matter, since $\Phi(\{x\}) \ne 0$ for at most countably many
points x (why?) and hence $\Phi(\{x\}) = 0$ on a set of full measure.


11 Here, as on p. 206, we allow h < 0 by changing $Q_h(x_0)$ to

$$\{x : x_1^{(0)} + h \le x_1 \le x_1^{(0)}, \ldots, x_n^{(0)} + h \le x_n \le x_n^{(0)}\}$$

and $\Phi[Q_h(x_0)]$ to $-\Phi[Q_h(x_0)]$.




COROLLARY 3. If F(x) is absolutely continuous in [a, b], then F′(x)
exists almost everywhere. Moreover, F′(x) is summable and¹²

$$F(x) - F(a) = \int_a^x F'(\xi)\,d\xi. \tag{11}$$

COROLLARY 4. If F(x) is singular in [a, b], then F′(x) = 0 almost
everywhere.

COROLLARY 5. If F(x) is of bounded variation in [a, b] and if F′(x) = 0
almost everywhere, then F(x) is singular in [a, b].

Proof. Writing F(x) in the form

$$F(x) = A(x) + S(x),$$

where A(x) is absolutely continuous and S(x) is singular, we use Corollary
4 to deduce that A′(x) = 0 almost everywhere. But then the absolutely
continuous set function

$$A(E) = \int_E A'(x)\,dx$$

vanishes on every Borel set E, and hence $A(x) = A[a, x] \equiv 0$, i.e.,
F(x) = S(x), as asserted.
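The contrast between Corollaries 3 and 4 shows up clearly for the Cantor function of Prob. 4, which is singular: its derivative vanishes almost everywhere, yet it increases from 0 to 1, so formula (11) fails for it. The sketch below is an illustration under stated assumptions. The function `cantor` is a standard digit-by-digit evaluation written for this example, and the sample size, seed and step h are arbitrary choices. It estimates the fraction of random points at which the difference quotient over (x, x + h] is exactly zero.

```python
import random

def cantor(x, depth=40):
    """Cantor function on [0, 1], computed from the ternary expansion of x."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3.0
        if x >= 2.0:          # ternary digit 2 contributes a binary digit 1
            value += scale
            x -= 2.0
        elif x >= 1.0:        # ternary digit 1: x lies in a removed middle third
            return value + scale
        scale /= 2.0
    return value

random.seed(1)
h = 1e-7
points = [random.random() for _ in range(2000)]
flat = sum(1 for x in points if cantor(x + h) == cantor(x))
print(flat / len(points), cantor(1.0) - cantor(0.0))
```

Nearly all sample points give a zero quotient, while the total increase is 1; the integral of the derivative therefore recovers none of it.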


Remark 1. As shown in Prob. 3, p. 223, there exist functions whose
derivatives vanish almost everywhere but which are not of bounded variation
(and hence not singular).


Remark 2. Let F(x) be a generating function of bounded variation in
[a, b], with decomposition

$$F(x) = A(x) + S(x),$$

where A(x) is absolutely continuous and S(x) is singular. Then A(x) and
S(x) can be found from F(x) itself by using the formulas

$$A(x) = \int_a^x F'(\xi)\,d\xi, \qquad S(x) = F(x) - \int_a^x F'(\xi)\,d\xi.$$

THEOREM 4. If F(x) is a generating function of bounded variation
in [a, b], with positive variation P(x), negative variation Q(x) and total
variation V(x), then

$$P'(x) = [F'(x)]^+, \qquad Q'(x) = [F'(x)]^-, \qquad V'(x) = |F'(x)| \tag{12}$$

almost everywhere. Moreover,

$$P(x) = \int_a^x [F'(\xi)]^+\,d\xi, \qquad Q(x) = \int_a^x [F'(\xi)]^-\,d\xi, \qquad V(x) = \int_a^x |F'(\xi)|\,d\xi \tag{13}$$

if F(x) is absolutely continuous in [a, b].


12 Formula (11) is a far-reaching generalization of the classical relation between definite 
integrals and primitives. 




Proof. Let Φ(E) be the set function with generating function F(x),
and let X⁺ and X⁻ be the two sets figuring in the corresponding Hahn
decomposition of X = [a, b] (see Theorem 10, p. 163). Then Φ(E) ≥ 0
on every Borel set E ⊂ X⁺. But then F'(x) ≥ 0 almost everywhere on X⁺,
since otherwise F'(x) < −c < 0 on a set E_0 ⊂ X⁺ of positive measure,
and then according to Lemma 1', p. 212, there would be a set Q ⊂ E_0
of positive measure such that Φ(Q) < −cμ(Q), which is impossible.
Similarly, F'(x) ≤ 0 almost everywhere on X⁻.

Next let

\[ \Phi(E) = p(E) - q(E) \]

be the representation of Φ(E) in terms of its positive variation p(E) and
negative variation q(E) [see Theorem 8, p. 160]. Then P(x) is the generating
function of p(E) and Q(x) the generating function of q(E). Since
q(E) vanishes on X⁺ and all its Borel subsets, we have Q'(x) = 0 almost
everywhere on X⁺, by substantially the same argument used to prove
that F'(x) ≥ 0 on X⁺, and similarly P'(x) = 0 almost everywhere on X⁻.
Since F'(x) = P'(x) − Q'(x), it follows that F'(x) = P'(x) almost everywhere
on X⁺ and F'(x) = −Q'(x) almost everywhere on X⁻. Together
with the formula V'(x) = P'(x) + Q'(x) and the properties of F'(x)
just proved, this implies (12). Moreover, if F(x) is absolutely continuous
in [a, b], then so are the functions P(x), Q(x) and V(x) [see p. 184],
which therefore equal the integrals of their own derivatives, by Corollary
3, thereby proving (13).
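
Formulas (13) can be checked numerically on a smooth example of our own choosing, F(x) = sin x on [0, 2π], where F'(x) = cos x. The Python sketch below, with hypothetical function names, approximates the variations by sums of increments over a fine uniform partition and compares them with midpoint-rule integrals of [F']⁺, [F']⁻ and |F'|. Both computations are approximations under these assumptions, not the construction used in the text.

```python
import math


def variations(F, a, b, n=20000):
    """Approximate positive, negative and total variation of F on [a, b]
    by summing the increments of F over a uniform partition."""
    P = Q = 0.0
    prev = F(a)
    for i in range(1, n + 1):
        cur = F(a + (b - a) * i / n)
        d = cur - prev
        if d > 0:
            P += d      # positive increments build the positive variation
        else:
            Q -= d      # negative increments build the negative variation
        prev = cur
    return P, Q, P + Q


def integrate(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    a, b = 0.0, 2.0 * math.pi
    P, Q, V = variations(math.sin, a, b)
    IP = integrate(lambda x: max(math.cos(x), 0.0), a, b)    # integral of [F']^+
    IQ = integrate(lambda x: max(-math.cos(x), 0.0), a, b)   # integral of [F']^-
    IV = integrate(lambda x: abs(math.cos(x)), a, b)         # integral of |F'|
    print(P, IP)
    print(Q, IQ)
    print(V, IV)
```

For this example each pair of numbers agrees to well within the discretization error, with P(2π) = Q(2π) = 2 and V(2π) = 4.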


10.5. Differentiation with Respect to the Underlying σ-Ring


To avoid repetition, we establish the convention that throughout this
section X is a set equipped with a σ-ring 𝔄 of Borel sets E ⊂ X, a Borel
measure μ, a summable function φ(x) and a Vitali system 𝒱 of sets E ∈ 𝔄.


DEFINITION 1. A point x_0 ∈ X is said to be a Lebesgue point of φ(x)
(relative to 𝒱) if

\[ \lim_{\varepsilon \to 0} \frac{1}{\mu[A_\varepsilon(x_0)]} \int_{A_\varepsilon(x_0)} |\varphi(x) - \varphi(x_0)|\,\mu(dx) = 0, \tag{14} \]

where A_ε(x_0) is any Vitali set of measure less than ε containing x_0.

THEOREM 5. Almost every point of X is a Lebesgue point of φ(x).


Proof. Given any real number r, it follows from the Lebesgue-Vitali
theorem that

\[ \lim_{\varepsilon \to 0} \frac{1}{\mu[A_\varepsilon(x_0)]} \int_{A_\varepsilon(x_0)} |\varphi(x) - r|\,\mu(dx) = |\varphi(x_0) - r| \tag{15} \]

for all x_0 in a set E_r of full measure. Then the set

\[ E = \bigcap_{n=1}^{\infty} E_{r_n}, \]

where r_n ranges over all the rational numbers, is also of full measure.
Let x_0 be a point of E such that φ(x_0) is finite. Then, given any δ > 0,
there is a rational number r such that |φ(x_0) − r| < δ/3, and hence

\[
\begin{aligned}
\frac{1}{\mu[A_\varepsilon(x_0)]} \int_{A_\varepsilon(x_0)} |\varphi(x) - \varphi(x_0)|\,\mu(dx)
&\le \frac{1}{\mu[A_\varepsilon(x_0)]} \int_{A_\varepsilon(x_0)} |\varphi(x) - r|\,\mu(dx)
 + \frac{1}{\mu[A_\varepsilon(x_0)]} \int_{A_\varepsilon(x_0)} |r - \varphi(x_0)|\,\mu(dx) \\
&= \left\{ \frac{1}{\mu[A_\varepsilon(x_0)]} \int_{A_\varepsilon(x_0)} |\varphi(x) - r|\,\mu(dx) - |\varphi(x_0) - r| \right\} + 2\,|\varphi(x_0) - r|.
\end{aligned}
\tag{16}
\]

Because of (15), the term in braces can be made less than δ/3 for sufficiently
small ε. Therefore the left-hand side of (16) can be made less
than δ for sufficiently small ε. This proves the theorem, since φ(x) is
finite almost everywhere.
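
To see what condition (14) excludes, consider an example of our own: the unit step φ(x) (equal to 1 for x ≥ 0 and to 0 for x < 0) on the real line with ordinary Lebesgue measure. The sketch assumes that the symmetric intervals [x_0 − h, x_0 + h] may serve as the Vitali sets A_ε(x_0). It computes the average in (14) exactly with rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction


def phi(x):
    """Unit step: 1 for x >= 0, 0 for x < 0."""
    return 1 if x >= 0 else 0


def mean_deviation(x0, h):
    """Exact average of |phi(x) - phi(x0)| over [x0 - h, x0 + h]."""
    lo, hi = x0 - h, x0 + h
    zero = Fraction(0)
    neg_len = max(zero, min(hi, zero) - lo)  # length of the part where phi = 0
    pos_len = (hi - lo) - neg_len            # length of the part where phi = 1
    differing = neg_len if phi(x0) == 1 else pos_len
    return differing / (hi - lo)


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        h = Fraction(1, 10 ** k)
        print(h, mean_deviation(Fraction(0), h), mean_deviation(Fraction(1, 2), h))
```

At x_0 = 0 the average equals 1/2 for every h, so 0 is not a Lebesgue point of this φ. At x_0 = 1/2 the average vanishes as soon as h < 1/2. The exceptional set {0} has measure zero, as Theorem 5 requires.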


Next we generalize the notion of “regular convergence,” already encountered
on p. 207:

DEFINITION 2. A sequence E_1, E_2, ... of Borel sets is said to converge
regularly to a point x_0 ∈ X if

1) Every E_n is contained in a Vitali set A_n such that x_0 ∈ A_n and
μ(A_n) → 0 as n → ∞;

2) There is a fixed constant c > 0 such that

\[ \mu(E_n) \ge c\,\mu(A_n) \]

for every n.


We are now in a position to generalize differentiation in the sense of
formula (3), p. 207:

DEFINITION 3. Let Φ(E) be a countably additive set function defined
on the σ-ring 𝔄 (and hence on the Vitali system 𝒱). Then by the derivative
of Φ(E) at the point x_0 ∈ X with respect to 𝔄 is meant the quantity

\[ D_{\mathfrak{A}}\Phi(x_0) = \lim_{n \to \infty} \frac{\Phi(E_n)}{\mu(E_n)} \]

(provided the limit exists), where E_1, E_2, ... is any sequence of Borel sets
regularly converging to x_0.




THEOREM 6. If E_1, E_2, ... is a sequence of Borel sets converging
regularly to a Lebesgue point x_0 of φ(x), then

\[ \lim_{n \to \infty} \frac{1}{\mu(E_n)} \int_{E_n} \varphi(x)\,\mu(dx) = \varphi(x_0). \]


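Proof. We need only note that

\[
\begin{aligned}
\left| \varphi(x_0) - \frac{1}{\mu(E_n)} \int_{E_n} \varphi(x)\,\mu(dx) \right|
&= \left| \frac{1}{\mu(E_n)} \int_{E_n} [\varphi(x_0) - \varphi(x)]\,\mu(dx) \right| \\
&\le \frac{1}{\mu(E_n)} \int_{E_n} |\varphi(x) - \varphi(x_0)|\,\mu(dx) \\
&\le \frac{1}{c\,\mu(A_n)} \int_{A_n} |\varphi(x) - \varphi(x_0)|\,\mu(dx) \to 0 \quad \text{as } n \to \infty.
\end{aligned}
\]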


COROLLARY. Let

\[ \Phi(E) = \int_E \varphi(x)\,\mu(dx) \]

be the “indefinite integral” of φ(x). Then

\[ D_{\mathfrak{A}}\Phi(x_0) = \varphi(x_0) \]

at every Lebesgue point x_0 of φ(x).
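
The role of the Lebesgue-point hypothesis in this corollary can be illustrated with an example of our own: the indefinite integral Φ(E) of the unit step φ(x) (equal to 1 for x ≥ 0 and to 0 for x < 0) with respect to ordinary Lebesgue measure on the line. The sketch assumes that the symmetric intervals A_n = [x_0 − h_n, x_0 + h_n] may serve as Vitali sets, so that the half intervals used below converge regularly with c = 1/2 or c = 1/4. The function names are hypothetical.

```python
from fractions import Fraction


def Phi(p, q):
    """Integral of the unit step over [p, q]: the length of [p, q] ∩ [0, ∞)."""
    zero = Fraction(0)
    return max(zero, q - max(p, zero))


def ratio(p, q):
    """Quotient Phi(E) / mu(E) for the interval E = [p, q]."""
    return Phi(p, q) / (q - p)


if __name__ == "__main__":
    zero, half = Fraction(0), Fraction(1, 2)
    for k in (1, 2, 3):
        h = Fraction(1, 10 ** k)
        right = ratio(zero, h)                  # E_n = [0, h_n]
        left = ratio(-h, zero)                  # E_n = [-h_n, 0]
        inner = ratio(half + h / 2, half + h)   # E_n near x_0 = 1/2
        print(h, right, left, inner)
```

At x_0 = 0, which is not a Lebesgue point of this φ, the two regularly convergent sequences give the different quotients 1 and 0, so the derivative with respect to the σ-ring does not exist there. At the Lebesgue point x_0 = 1/2 the quotient equals φ(1/2) = 1 for every sufficiently small h_n.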


THEOREM 7. If Φ(E) is a countably additive set function on X, then
the derivative of Φ(E) with respect to the underlying σ-ring 𝔄 exists on a
set of full measure, and coincides with the density of the absolutely continuous
component of Φ(E).


Proof. If Φ(E) is absolutely continuous, the result follows at once
from Theorem 5 and the corollary to Theorem 6. Thus suppose Φ(E) is
singular and nonnegative (the latter assumption entails no loss of generality),
and let x_0 be a point at which the derivative D_𝒱Φ(x_0) with respect
to the Vitali system 𝒱 exists and equals zero. As we know from Theorem
1, such points form a set of full measure. If E_1, E_2, ... is a sequence of
Borel sets converging regularly to x_0, then μ(E_n) ≥ cμ(A_n) and hence

\[ D_{\mathfrak{A}}\Phi(x_0) = \lim_{n \to \infty} \frac{\Phi(E_n)}{\mu(E_n)} \le \lim_{n \to \infty} \frac{1}{c}\,\frac{\Phi(A_n)}{\mu(A_n)} = \frac{1}{c}\,D_{\mathcal{V}}\Phi(x_0) = 0, \]

where the A_n are suitable Vitali sets. The theorem now follows from the
observation that a general set function Φ(E) is the sum of its absolutely
continuous and singular components.¹³


13 It should be kept in mind that the singular component of Φ(E), unlike its absolutely
continuous component, may not be defined on all Lebesgue-measurable subsets E ⊂ X
(recall Prob. 10, p. 204).




COROLLARY. If F(x) = Φ[a, x] is of bounded variation in [a, b], then
the derivative in the sense

\[ F'(x_0) = \lim_{n \to \infty} \frac{\Phi(E_n)}{\mu(E_n)}, \]

where E_1, E_2, ... is a sequence of Borel sets converging regularly to x_0,¹⁴
exists almost everywhere and equals the absolutely continuous component
of F(x).
PROBLEMS 


1. Construct a function F(x) with Properties a and b listed on p. 207. 


Hint. As n → ∞ the right-hand end points of the intervals p2^{-n} ≤ x <
(p + 1)2^{-n} containing x_0 form a strictly decreasing sequence {x_n} converging
to x_0, where obviously
Xn Xo 


Xn+1 — X <5 


4 


Let F(x) be a continuous function equal to zero at the points x_n and linear
in the intervals (x_{n+1}, ξ_n), (ξ_n, x_n), where

\[ \xi_n = \frac{x_{n+1} + x_n}{2}, \]

and let F(x) take the value ξ_n − x_0 at the midpoint ξ_n. Then F(x) clearly has
no derivative in the ordinary sense at x_0. Verify that F(x) is of bounded variation.


2. Show that the function

\[ F(x) = x^2 \sin\frac{1}{x} \]

fails to have a derivative in the sense of formula (3), p. 207 at the point x_0 = 0,
although its ordinary derivative exists and equals zero at x_0.


Hint. Choose E,, to be the set of intervals 


1 1 
(n) (Nc a 
x b > 
ap.apie (25 
on which F(x) increases. 
14 In particular, E_1, E_2, ... can be any sequence of Borel sets such that 1) every E_n is
contained in the interval (x_0 − h_n, x_0 + h_n], where h_n → 0 as n → ∞, and 2) there is a
fixed constant c > 0 such that μ(E_n) ≥ 2ch_n for every n (cf. p. 207).

3. The function

\[ F(x) = x \sin\frac{1}{x} \]

is not of bounded variation in [−1/π, 1/π]. By replacing F(x) on its intervals of
monotonicity by a function of the Cantor type (see Prob. 2, p. 86), construct
a continuous function whose derivative vanishes almost everywhere but which
is not of bounded variation (and hence not singular).


4 (Fubini’s convergence theorem). Given a σ-ring 𝔄, let

\[ \sum_{n=1}^{\infty} \Phi_n(E) \]

be a series of nonnegative countably additive set functions converging on every
set E ∈ 𝔄 to a finite function Φ(E), which is itself countably additive by Prob.
5, p. 203. Prove that

\[ \sum_{n=1}^{\infty} D\Phi_n(x) = D\Phi(x) \]

almost everywhere.

Hint. On the set of full measure where all the derivatives DΦ(x), DΦ_1(x),
DΦ_2(x), ... exist, take the limit of the inequality

\[ \sum_{n=1}^{N} \frac{\Phi_n[A(x_0)]}{\mu[A(x_0)]} \le \frac{\Phi[A(x_0)]}{\mu[A(x_0)]}, \]

obtaining

\[ \sum_{n=1}^{N} D\Phi_n(x) \le D\Phi(x). \]

Given any integer k > 0, let N_k be such that

\[ \Phi(X) - \sum_{n=1}^{N_k} \Phi_n(X) < \frac{1}{2^k}. \]

Then the series with general term

\[ \Psi_k(E) = \Phi(E) - \sum_{n=1}^{N_k} \Phi_n(E) \]

converges for every E ∈ 𝔄, and the series with general term

\[ D\Psi_k(x) = D\Phi(x) - \sum_{n=1}^{N_k} D\Phi_n(x) \]

converges on a set of full measure. Therefore

\[ \sum_{n=1}^{N_k} D\Phi_n(x) \to D\Phi(x) \quad \text{as } k \to \infty, \]

and hence

\[ \sum_{n=1}^{N} D\Phi_n(x) \to D\Phi(x) \quad \text{as } N \to \infty. \]
5. Given a summable nonnegative function g(x) defined on a set X equipped
with a Lebesgue measure μ, define a new countably additive set function G(E) =
I(gχ_E). Regarding G(E) as a new elementary measure, extend G(E) to a new
Lebesgue measure, and construct the corresponding space of summable functions
L_G. Show that a function φ(x) belongs to L_G if and only if φg belongs
to L, and moreover

\[ I_G\varphi = I(\varphi g). \]

Hint. Generalize the considerations of Example 2, pp. 90, 145.


6.¹⁵ Let X be an arbitrary set equipped with a measure μ, and let 𝔅 be a family
of Borel subsets of X which covers a given set E ⊂ X and has the following
properties:

1) Given any x ∈ E and any ε > 0, there is a set in 𝔅 of measure less than
ε which contains x;

2) If x ∉ A_0, A_0 ∈ 𝔅, then every set in 𝔅 of sufficiently small measure
containing x does not intersect A_0;

3) There exist two positive numbers a and b such that 


u(A) <a 
AAg# 2 


( U A) < bat 
Prove that E can be covered to within a set of measure zero by countably
many disjoint sets A_i ∈ 𝔅.


Hint. The proof resembles that of Theorem 3, p. 216, but use of Property 
3 replaces the argument involving the side lengths of the cubes Q and Q,. 


7. Let X and Y be two sets, equipped with (nonnegative) measures μ and ν,
respectively, and suppose y = f(x) is a one-to-one mapping of X onto Y, or
at least “almost one-to-one” (i.e., one-to-one after deleting a set of μ-measure
zero from X and a set of ν-measure zero from Y). Moreover, suppose the mapping
y = f(x) carries measurable sets into measurable sets and sets of μ-measure
zero into sets of ν-measure zero. Let 𝒱 be a Vitali system of subsets of X, and
define the “Jacobian of the transformation” by the formula

\[ \mathcal{J}\!\left(\frac{y}{x}\right) = \lim_{\varepsilon \to 0} \frac{\nu[f(A_\varepsilon(x_0))]}{\mu[A_\varepsilon(x_0)]}, \tag{17} \]

where A_ε(x_0) is any Vitali set of measure less than ε containing the point x_0.
Show that the function (17) exists almost everywhere on X and is summable on
every summable set E ⊂ X.

Hint. The function Φ(E) = ν[f(E)] is countably additive and absolutely
continuous.


15 Due to V. A. Yashnikov. 




8. With the same notation as in the preceding problem, prove the validity
of the formula for “changing variables”

\[ \int_Y \varphi(y)\,\nu(dy) = \int_X \varphi[f(x)]\,\mathcal{J}\!\left(\frac{y}{x}\right)\mu(dx), \]

where the existence of either side implies that of the other.

Hint. Introduce the new measure

\[ \Phi(E) = \nu[f(E)] = \int_E \mathcal{J}\!\left(\frac{y}{x}\right)\mu(dx) \]

on all summable subsets E ⊂ X, and then use Prob. 5, p. 224.


BIBLIOGRAPHY 


Berberian, S. K., Measure and Integration, The Macmillan Co., New York (1965). 

Burkill, J. C., The Lebesgue Integral, Cambridge University Press, London (1953).

Dunford, N. and J. T. Schwartz, Linear Operators, Part I: General Theory, Interscience
Publishers, Inc., New York (1958).

Goffman, C., Real Functions, Holt, Rinehart and Winston, Inc., New York (1953). 

Graves, L. M., The Theory of Functions of Real Variables, second edition, McGraw- 
Hill Book Co., New York (1956). 

Hahn, H. and A. Rosenthal, Set Functions, University of New Mexico Press, 
Albuquerque, New Mexico (1948). 

Halmos, P. R., Measure Theory, D. Van Nostrand Co., Inc., Princeton, N.J. (1950). 

Hartman, S. and J. Mikusinski, The Theory of Lebesgue Measure and Integration 
(translated by L. F. Boron), Pergamon Press, Oxford (1961). 

Hewitt, E. and K. Stromberg, Real and Abstract Analysis, Springer-Verlag, Inc., 
New York (1965). 

Hildebrandt, T. H., Introduction to the Theory of Integration, Academic Press, Inc., 
New York (1963). 

Jeffery, R. L., The Theory of Functions of a Real Variable, second edition, University 
of Toronto Press, Toronto (1953). 

Kestelman, H., Modern Theories of Integration, second revised edition, Dover 
Publications, Inc., New York (1960). 

Kolmogorov, A. N. and S. V. Fomin, Measure, Lebesgue Integrals, and Hilbert 
Space (translated by N. A. Brunswick and A. Jeffrey), Academic Press, Inc., New 
York (1961). 

Loomis, L. H., An Introduction to Abstract Harmonic Analysis, D. Van Nostrand 
Co., Inc., Princeton, N.J. (1953). 

McShane, E. J., Integration, Princeton University Press, Princeton, N.J. (1944). 

McShane, E. J. and T. A. Botts, Real Analysis, D. Van Nostrand Co., Inc., Princeton,
N.J. (1959).


Munroe, M. E., Introduction to Measure and Integration, Addison-Wesley Publish- 
ing Co., Inc., Reading, Mass. (1953). 

Natanson, I. P., Theory of Functions of a Real Variable (translated by L. F. Boron, 
with the collaboration of E. Hewitt), Frederick Ungar Publishing Co., New York. 
Volume I (1955), Volume II (1960). 

Riesz, F. and B. Sz.-Nagy, Functional Analysis (translated by L. F. Boron), Frederick 
Ungar Publishing Co., New York (1955). 

Rogosinski, W., Volume and Integral, Interscience Publishers, Inc., New York 
(1962). 

Royden, H. L., Real Analysis, The Macmillan Co., New York (1963). 

Saks, S., Theory of the Integral (translated by L. C. Young, with two notes by S. 
Banach), second revised edition, Dover Publications, Inc., New York (1964). 

Sz.-Nagy, B., Introduction to Real Functions and Orthogonal Expansions, Oxford 
University Press, New York (1965). 

Taylor, A. E., General Theory of Functions and Integration, Blaisdell Publishing 
Co., New York (1965). 

Titchmarsh, E. C., The Theory of Functions, second edition, Oxford University 
Press, New York (1939). 

Von Neumann, J., Functional Operators, Volume I: Measures and Integrals, Prince- 
ton University Press, Princeton, N.J. (1950). 

Williamson, J. H., Lebesgue Integration, Holt, Rinehart and Winston, Inc., New 
York (1962). 


Zaanen, A. C., An Introduction to the Theory of Integration, Interscience Publishers,
Inc., New York (1958).


INDEX 


A 


Absolute continuity of the integral on a 


set, 124 


Absolutely continuous point function, 196 


Absolutely continuous set function, 184 
density of, 205 
generating function of, 198 
Absolutely monotonic function, 80 
Adjacent intervals, 22 
Almost all, 14, 25 
Almost everywhere, 14, 25 
Apostol, T. M., 203 
Axiomatic measure theory, 3, 150-180 


B 


Banach, S., 179, 187, 217 
Basic block, 7, 61 

improper boundary of, 62 

lower boundary of, 62 

kth sheet of, 62 
subblock of, 62 
upper boundary of, 62 
kth sheet of, 62

Bernstein’s theorem, 78 
Block(s), 3, 7, 61, 169 

basic (see Basic block) 

coverings by, 13 

dense set of, 63 

downward convergence of, 71, 94 

partition of, 7 

size of, 7 

strict inclusion of, 93 

volume of, 7 
Bochner-Khinchin theorem, 80 
Bois-Reymond, P. du, 8 
Borel function, 149 


Borel measure, 152

generalized, 152 

signed (see Signed Borel measure) 
Borel set(s), 145 ff. 

abstract, 152 

classical, 145 

generalized, 152 

regularly convergent, 207, 221 


C 


Cantor function, 86, 203 
Cantor set, 21, 86, 148, 203 
Cartesian product, 40 
Cauchy, A. L., 1, 8 
Cauchy criterion, 38 
Cauchy sequence, 38 
Class L, 29 ff. 
completeness of, 39 
integration in, 30 
operations in, 30 
Class L+, 26 ff. 
integration in, 27 
properties of integral in, 28-29 
Compact metric space, 166 
Completely monotonic function, 77 
Consistency condition, 169 
Constructive measure theory, 3, 134-149 
Continuity axiom, 24 
Continuous linear functionals: 
on C(B), 94-96 
representation of, 95 
on C(X), 166-167 
representation of, 166 
on L(X), 190-192 
representation of, 190 
on L,(X), 192-194 
representation of, 192 
Continuum hypothesis, 57 




Convergence in measure, 132 
Countable additivity, 117 
Cubes as a Vitali system, 216 
Cylinder set, 168 

base of, 168


D 


Daniell, P. J., 8, 30
Darboux, J. G., 8 
Darboux sum: 
lower, 2,9, 17 
upper, 9, 17 
Davis, J. D., 80 
De Possel's theorem, 4, 208, 216 
Derivative, 183-226 
lower, 209 
upper, 209 
with respect to a net, 208 
with respect to a o-ring, 221 
with respect to a Vitali system, 209 
Dini's lemma, 54 
Dirichlet, P. G. L., 8
Dirichlet function, 8 
Discrete set function, 184 
Downward convergence, 71, 94 
Dunford, N., 168


E 


Egorov’s theorem, 131 
Elementary functions, 2, 3, 21, 23 ff. 
continuous functions as, 54, 84 
step functions as, 50, 105
Elementary integral, 3, 24 ff. 
Riemann integral as, 54 
Riemann-Stieltjes integral as, 88 
Elementary measure, 150 
Borel extensions of, 157 
Lebesgue extensions of, 156 
Elementary sets, 150 
Essential convergence of quasi-volumes,
72


Essentially bounded function, 190 
Extension, 170


F 


Fatou’s lemma, 37 
Finite subcovering lemma, 13


Finitely-Lebesgue measure, 152 
Fréchet, M., 8 
Fubini’s convergence theorem, 224 
Fubini’s theorem, 40—44, 125-126 
for functions of several real variables,
52-53
Full measure, set of, 13, 25 
Function of bounded variation, 65 
differentiation of, 218 
Functionals of variable sign, 44—49 
canonical representation of, 48, 103 
relation to canonical representation 
of quasi-volumes, 103 
representation of, 44 
Fundamental sequence (see Cauchy se- 
quence) 


G 


Generalized Borel measure, 152 
Generalized Borel set, 152 
Generating function(s), 64, 66 

absolutely continuous, 198 

positive, negative and total variations 

of, 85 
sequence of, 75 
essentially convergent, 75 
singular, 199 


H 


Hahn decomposition, 163 
Hahn-Banach theorem, 109 
Hausdorff, F., 148 

Hausdorff space, 168 

Heine, E., 8 

Heine-Borel theorem, 13 

Helly’s convergence theorem, 72 
Helly's selection principle, 74 
Herglotz’s theorem, 76 

Holder’s inequality, 127 


I


Infinite-dimensional cube, 168
natural topology of, 168 
quasi-volumes on, 169 

Inner measure, 147


J 


Jordan’s theorem, 85 
Jump function, 201 


K 


Kolmogorov’s theorem, 172 
Korenblyum, B. I., 77 
Kuratowski, C., 179 


L 


Lebesgue, H., 2, 3, 8, 30, 121, 122, 147 
Lebesgue-integrable function, 29 
Lebesgue-measurable set, 145 
Lebesgue-Stieltjes integrable function, 89 
Lebesgue-Stieltjes integral, 88—109 
on an infinite-dimensional space, 167-— 
178 
with respect to a signed quasi-volume, 
93 
Lebesgue-Vitali theorem, 208, 213 
consequences of, 215-220 
Lebesgue decomposition, 202 
Lebesgue integral, 30 ff. 
as defined by Lebesgue, 122 
in n-space, 50—57 
relation to Riemann integral, 50-51 
Lebesgue measure, 145, 152 
construction of the integral from, 158 
for n = 1, 147 
Lebesgue point, 220 
Lebesgue’s (bounded convergence) the- 
orem, 36 
Lebesgue’s criterion for Riemann inte- 
grability, 8, 20 
Lebesgue’s theorem on differentiation of 
a function of bounded variation, 
218 
Levi’s theorem, 32 
Limit inferior, 19 
Limit superior, 19 
Limiting value, 19 
Loomis, L. H., 30 
Lower derivative, 209 
Lower function, 18 
invariant definition of, 19 




Lower integral, 10 
Lusternik, L. A., 109 
Luzin, N. N., 149 


M 


Markushevich, A. I., 76 
Measurability criterion, 142 
Measurable functions, 37, 113 ff. 
characterization in terms of measure, 
120 
Measurable set, 116 ff. 
measure of, 117 
Measurable subset, 123 
absolute continuity of integral on,
124 
function measurable on, 123 
function summable on, 123 
integration over, 123 
Measure zero, set of, 13, 24 
Measure(s), 113 ff. 
Borel, 152 
countable additivity of, 117 
countably additive, 150 
elementary, 150 
finitely-Lebesgue, 152 
in n-space, 143-147 
inner, 147 
intersection of, 156 
Lebesgue, 145, 152 
on a product space, 125-126 
outer, 141 
relation to quasi-volumes, 162-163 
signed Borel, 159 


N 


Negative part, 12, 23 
Negative variation, 84, 85, 161 
Net, 208 
derivative with respect to, 208 
as a Vitali system, 215 
Nonmeasurable function, 57 
Nonmeasurable set, 178 
Nonnegativity axiom, 24 
Norm, 38 
Normed linear space, 38, 127 
complete, 38, 127 




O 


ω-measurable set, 172
structure of, 173-178

ω-summable set, 172

ω-summable function, 172
structure of, 173-178


Outer measure, 141 


P 


Partition, 7 

refinement of, 9 
Positive definite function, 80 
Positive part, 12, 23 
Positive variation, 84, 85, 161 
Projection, 170 


Q 


Quasi-length(s), 64 ff. 
continuous, 102 
equivalent, 105 


generating function (distribution func- 


tion) of, 64 
positive and negative variations of, 
85 
total variation of, 65, 85 
Quasi-volume(s), 3, 63 ff. 

additivity of, 63 
continuous, 99-103 

at infinity, 69 

on the empty set, 100 
equicontinuous at infinity, 73 
equivalent, 71, 103-105 
essential convergence of, 72 
generating function of, 66 
nonnegative, 63 
of bounded variation, 63 
relation to measure theory, 162-163 
signed (see Signed quasi-volume ) 
total variation of, 63 

Quasi-volume a, 95

relation to σ, 96-99


R 


Radon, J. R., 8 
Radon-Nikodym theorem, 4, 189 


Radon-Nikodym theorem (cont.): 
consequences of, 190-194 
Rectangular parallelepiped, 7 
Rectifiable curve, 86 
Regular convergence, 207, 221 
Riemann, B., 1, 8 
Riemann integral, 7 ff. 
definition of, 8 
double, 40 
improper, 51-52 
lower, 10 
relation to Lebesgue integral, 50-52 
upper, 10 
Riemann sum, 8 
Riemann-Stieltjes integrable function, 66 
Riemann-Stieltjes integral, 61-87 
construction of, 66 
for an infinite basic block, 69 
properties of, 68 
Riemann-Stieltjes sum, 66 
Riesz-Fischer theorem, 39 
generalization to L,, 128 
Riesz’s representation theorem, 44 
Ring, 150 


S


Saks, S., 187

Schwarz’s formula, 76 

Schwartz, J. T., 168 

Semiring(s), 63, 134 ff. 
completely sufficient, 139-141 
of summable subsets, 136 

subspace generated by, 136 

properties of, 134-135 
sufficient, 136-139 

Set function(s), 183 ff. 
absolutely continuous, 184 
classification of, 183-184 
concentrated, 183 
continuous, 184 

decomposition of, 187 

decomposition of, 185 
derivatives of, 205-226
discrete, 184 
singular, 184 
variations of the sum of, 194-195 

Sets of rank n, 208 

Sheet, 13, 62 

Sheet of discontinuity, 106 


Shilov, G. E., 80 
σ-measurable function, 144
σ-measure, 144
σ-measure zero, set of, 89
σ-ring, 151
σ-summable function, 89, 144
0,-ring, 152 
=-ring, 152 
2yting, 152 
Signed Borel measure, 159-162 
positive, negative and total variations 
of, 161 
representation of, 160 
canonical, 161 
Signed quasi-volume, 63, 81-85 
Lebesgue-Stieltjes integration with re- 
spect to, 93 
positive, negative and total variations 
of, 84 
representation of, 81 
canonical, 83, 103 
Silverman, R. A., 76 
Simple set, 141 
Singular set function, 184 
Smolyanov, O. G., 180 
Space L_ω(X), 172
Space L_p, 126-131
Step functions, 2, 11-13, 15-17
as elementary functions, 50, 54 
definition of, 11 
integral of, 12 
lower, 17 
upper, 17 
Stieltjes integral, 61-109 
Stone’s axioms, 119 
Subblock, 62 
Summable function, 29 
Summable set, 117 
measure of, 117 




T 


T-dimensional cube (see Infinite-dimen- 
sional cube) 

Tolstov, G. P., 57 

Topological space, 168 

Total variation, 63, 65, 84, 85, 161 

Triangle inequality, 38, 128 

Tychonoff’s theorem, 168 


U 


Upper derivative, 209 
Upper function, 18 

invariant definition of, 19 
Upper integral, 10 


V 


Vitali set, 209 
boundary of, 209 
Vitali system, 209 
derivative with respect to, 209 


W 


Wallace, D. A. R., 80 
Well-ordering hypothesis, 57 


Y 


Yashnikov, V.A., 225 
Young, L. C., 187 
Yunovich, Y. N., 213 


INTEGRAL,
MEASURE & DERIVATIVE:
A UNIFIED APPROACH

G. E. Shilov & B. L. Gurevich


This is a volume in Richard Silverman’s exceptional series of 
translations of outstanding Russian mathematical texts. In it he 
has produced, with the cooperation of the original authors, a book 
that is uniquely accessible and useful to English language readers. 


The authors approach the subject of “functions of a real variable”
through the consistent use of the Daniell scheme, offering a rare
and useful alternate to regular approaches. They start from the
concept of an elementary integral defined (axiomatically) on a
family of elementary functions, rather than from preliminary
construction of a theory of measure, thereby getting to the crux of
the matter quickly and directly.


Part one is devoted to the integral, moving from the Riemann
integral and step functions to a general theory, and obtaining the
“classical” Lebesgue integral in n-space. In part two the Lebesgue-
Stieltjes integral is constructed through the Daniell scheme with
the Riemann-Stieltjes integral used as the elementary integral. In
part three the general Daniell scheme is used to develop the
theory of measure which now appears as a natural and almost
self-evident consequence of the theory of the integral; more advanced
chapters lead to a deeper study of constructive measure
theory and axiomatic measure theory. Part four, devoted to the
theory of the derivative, first develops the Radon-Nikodym theorem
and the Lebesgue decomposition. Then the derivative is defined
using differentiation with respect to a net, with respect to a Vitali
system, and with respect to the class of all Borel subsets of X.


Many problems (with hints and answers) are supplied. While
normally used for graduate courses in “functions of a real variable,”
this book can be understood by any reader with a good
background in advanced calculus and with sufficient “mathematical
maturity.” It offers an excellent alternate point of view for normal
course work, or can be used for supplementary work and self-study.


Unabridged republication of original (1966) edition. Index. English
bibliography. xiv + 233pp. 5⅜ x 8½. 63519-8 Paperbound


$4.50 in U.S.A.


Cover design by Edmund V. Gillon, Jr.


